SUMMARY

In both rich and poor nations, public resources for health care are inadequate 
to meet demand. Policy makers and health care providers must determine 
how to provide the most effective health care to citizens using the limited 
resources that are available. This chapter describes current and future 
challenges in the delivery of health care, and outlines the role that operations 
research (OR) models can play in helping to solve those problems. The 
chapter concludes with an overview of this book - its intended audience, the 
areas covered, and a description of the subsequent chapters. 

KEY WORDS 

Health care delivery, Health care planning 




HEALTH CARE DELIVERY: PROBLEMS AND CHALLENGES 






1.1 WORLDWIDE HEALTH: THE PAST 50 YEARS 

Human health has improved significantly in the last 50 years. In 1950, 
global life expectancy was 46 years [1]. That figure rose to 61 years by 
1980 and to 67 years by 1998 [2]. Many of these gains occurred in low- and
middle-income countries, and were due in large part to improved nutrition 
and sanitation, medical innovations, and improvements in public health 
infrastructure. 

However, not all countries have experienced an increase in life expectancy 
in recent years. In countries of the former Soviet Union, life expectancy 
dropped from 70 years in 1986 to 64 years in 1994, with an even more 
marked drop among men [3]. Factors contributing to this decline include 
economic and social instability, high rates of tobacco and alcohol 
consumption, poor nutrition, depression, and deterioration of the health care 
system [4]. In many African nations, life expectancy has been significantly
diminished by HIV/AIDS. In seven African countries life expectancy is now
less than 40 years and falling [5].

Worldwide, infectious diseases kill 13 million people per year [6]. In 1999, 
2.8 million people died from AIDS alone [7]. Infectious diseases once
confined to specific geographic regions have spread across country borders
as a result of increasing global travel. New infectious diseases continue to
emerge [8]. Noncommunicable diseases such as heart disease, cerebrovascular
disease (stroke), cancer, and diabetes are the primary cause of death
in high-income countries. Such diseases currently account for less than half 
of all deaths in low-income countries, but in the next 20 years are expected 
to account for 70% of deaths [9]. In low-income countries, malnutrition 
remains a serious health problem, whereas in high-income countries, obesity 
is increasingly becoming a health problem. Tobacco, alcohol, and drug use 
have led to significant health problems worldwide. Tobacco use currently 
accounts for almost 5 million premature deaths per year. This figure is 
projected to rise to more than 8 million deaths per year by 2020, with many 
of these in low- and middle-income countries [10]. Food and water 
contamination cause at least 2 million premature deaths per year, primarily 
in low- and middle-income countries. Environmental agents (e.g., arsenic, 
lead, silica) also pose significant health risks in some areas. In addition, 
manmade chemical and biological weapons are a potential threat to public 
health. 

1.2 HEALTH CARE DELIVERY CHALLENGES 

Governments and health care providers face a variety of challenges in the 
delivery of health care. Below we describe current and future health care 
challenges. Because low- and middle-income countries face significantly
different challenges in health provision than do high-income countries, we 
describe their current health care challenges separately. We then describe 
the future challenges in health care delivery that are common to all 
countries. 

1.2.1 Current health care delivery challenges in low-income and middle-income countries

In low- and middle-income countries, where 80% of the world’s population 
lives, malnutrition and infectious diseases account for significant numbers of 
premature deaths. Half of young child deaths in low-income countries are 
caused by malnutrition [11]. Although vaccines are available for a number 
of infectious diseases that cause childhood deaths, 25% of the world’s 
children have not received these vaccines [12]. Many people in low- and 
middle-income countries do not receive even basic health care. Health 
facilities are often located in urban areas, far from rural areas and frequently 
difficult to access by public transportation. The care that is provided can be 
costly and substandard. In recent years, low- and middle-income countries 
have seen a significant shift in population from rural to urban areas, but have 
had no commensurate increase in urban health services. Inadequate 
infrastructure (e.g., inadequate roads, storage and distribution systems, 
electricity, clean water) and poorly functioning public health systems also 
impede the provision of health care. 

Resources for health care in low-income countries are quite limited. Among 
the world’s 60 poorest nations, annual per capita health spending in the year 
2000 was less than $15 [13], and approximately one third of this funding 
came from international aid. Such an amount is insufficient to provide even 
the most basic health services. In contrast, annual per capita health spending 
in the industrialized world was on the order of $2,000, and was $4,500 in the 
United States [13]. Even if low-income countries were to devote more of 
their scarce public funds to health care, as recently recommended by the 
World Health Organization [13], per capita spending would still be at levels 
far below that in the industrialized world. 

The lack of health care funding in low- and middle-income countries is 
exacerbated by rising health care needs and costs. Countries severely 
affected by HIV/AIDS are facing far greater demands for health care than 
they can meet. In other low- and middle-income countries, aging 
populations have increased overall demand for health care. Health care costs 
have risen as a result of new health care technologies and procedures. 
Moreover, many medicines that are routinely used in high-income countries 
(so-called “essential medicines”) are not affordable in low-income countries.
In some cases, even basic vaccines are too expensive. 

1.2.2 Current health care delivery challenges in high-income countries 

In high-income countries, resources for health care are orders of magnitude 
greater than in low-income countries. However, high-income countries face 
their own health care challenges. Although such countries spend much more 
on health than low-income countries, performance of health care systems 
varies markedly among high-income countries. For example, the United 
States spends almost twice as much per capita on annual health care as many 
other high-income nations, without achieving any greater life expectancy or 
any lower “burden of disease” (measured in terms of life years lived, 
adjusted for health disabilities) [14]. A recent report by the World Health 
Organization ranked the U.S. 37th in overall health systems performance
among 191 Member States [14]. France, which spends half as much as the 
U.S. on per capita annual health care, was ranked first in overall health 
systems performance [14]. (Health systems performance as measured in the 
report included not only measures of health, but also measures of health 
system fairness and responsiveness.) 

Inequities in health care provision exist within high-income countries. In 
countries with no national health system, such as the U.S., a significant 
fraction of individuals have no health insurance coverage and thus have only 
limited access to health care. Poor people and those in rural areas also often 
have only limited access to health care. In some high-income countries, 
including the U.S., the gap in life expectancy between rich and poor people 
is as great as the gap in life expectancy between high- and low-income 
countries [15]. 

Like low-income countries, high-income countries have experienced 
significant increases in demand for and cost of health care. Aging 
populations are making disproportionately heavy demands on health systems 
in high-income countries. Chronic conditions have become more prevalent. 
While new health care technologies and procedures have improved health, 
they have also increased costs. Patients are not only consuming more health 
services, but are consuming more intensive health services. Prescription 
drugs have also become increasingly expensive. In many high-income 
countries, health care spending has significantly outpaced economic growth. 
In the U.S., for example, health care spending accounted for 5% of the Gross
Domestic Product (GDP) in 1960 (or $143 per person); by 2001, health care 
spending accounted for 14.1% of the GDP (or $5,035 per person) [16]. As a 
result of these increases in demand for and cost of health care in high-income
countries, national health systems, insurers, and health care
providers are all under strain. 
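
As a rough check on the scale of this growth, the per-person figures quoted above imply a compound annual growth rate of roughly 9% over the 41 years from 1960 to 2001. The sketch below assumes that both dollar amounts are expressed in nominal (not inflation-adjusted) dollars, which the source does not state; the function name is illustrative only.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Per-person U.S. health care spending quoted in the text [16].
spending_1960 = 143.0
spending_2001 = 5035.0

rate = compound_annual_growth_rate(spending_1960, spending_2001, 2001 - 1960)
print(f"Implied growth in per-person spending: {rate:.1%} per year")
```

Because the calculation ignores inflation and population change, it indicates only the order of magnitude of the increase.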

1.2.3 Future health care delivery challenges 

As we begin the 21st century, many of the health care challenges described
above will continue. Preventable diseases will persist. Inequities in access 
to health care within and across countries will persist. Health care costs will 
continue to increase, as will demands for health care. Advances in medical 
knowledge will continue, along with costly new technologies and medicines.
Aging populations will consume increasing amounts of health care services. 
Patients will have increasing expectations for cures and treatments of more 
health problems. New means of delivering health care (e.g., telemedicine) 
will continue to emerge, creating a need for improved communication and 
information management systems. 

In both rich and poor nations, public resources for health care will remain 
inadequate to meet the demand. Policy makers and health care providers 
must determine how to provide the most effective health care to citizens 
using the limited resources that are available. Governments and health care 
providers must strive to meet basic health needs for all their citizens. 
Moreover, they must work to improve health and health-related quality of 
life for citizens in all stages of their lengthening life span. They must set 
health care priorities (e.g., between disease prevention and treatment or 
between alternate means of health care delivery) and develop health care 
systems that can deliver the needed health care in the most effective and 
efficient manner possible. Worldwide health improved dramatically during 
the 20th century. The challenge of the 21st century will be to continue this
improvement. 

1.3 PROVIDING EFFECTIVE AND EFFICIENT HEALTH CARE 

To provide the best health care given the limited resources that are available, 
policy makers need effective methods for planning, prioritization, and 
decision making, as well as effective methods for management and 
improvement of health care systems. The planning and management 
decisions facing policy makers and planners can be grouped into two broad 
areas: health care planning and organizing, and health care delivery. 

1.3.1 Health care planning and organizing 

Health care planning and organizing involves relatively high-level policy 
decisions about the economics of health care systems (e.g., health care 
resources, pricing, and financing), the structure of health care systems, and
other aspects of public policy regarding health care. 

Economics of health care systems At the highest level of planning, 
governments and other health care providers must determine the level of 
resources they will devote to health care, and how much they will spend on 
individual patients. Governments must decide which goods and services are 
to be paid for through public funding and who will receive those goods and 
services. Because funds are not available to meet all health care needs, 
governments must set priorities and determine how they will ration the 
health services they pay for. Health care providers must determine the cost 
of services and set prices. Government agencies and other large insurers 
must negotiate prices for drugs and vaccines. Insurers, including 
governments, must determine who will receive health insurance coverage 
and what that coverage will consist of. They must develop affordable, 
workable payment schemes for physicians and other health care providers, 
and must determine what fees patients must pay for health care services. 
Such financing schemes must provide proper incentives for health system 
efficiency. 

Structure of health care systems Another set of high-level decisions 
concerns the structure and organization of health care delivery systems. 
Health care providers must determine which goods and services they will 
provide and how to allocate resources among them. Governments must 
decide to whom the goods and services will be provided. Resources must be 
allocated among different levels of the health service - for example, among 
primary care and public health programs versus hospital services. Resources 
must be allocated between capital development and operating costs, and 
between salary and nonsalary expenditures. Resources must be allocated 
among geographic areas - for example, different regions of a country, or 
urban versus rural areas. Resources must be allocated among specific 
programs - for example, programs for control of specific diseases, 
immunization programs, or reproductive health programs. Resources must 
also be allocated among specific health care goods and services - for 
example, doctor visits, procedures, or medications. 

Other public policy issues In addition to economic and structural issues, 
decision makers face a variety of other policy decisions that have a broad 
effect on the delivery of health care. Policy makers must develop strategic 
plans for national and regional health improvement. These include 
identifying risks to public health (e.g., environmental contaminants, 
infectious disease epidemics, or unhealthy lifestyles) and developing plans 
for mitigating such risks. Such plans may include, for example, national or 
regional disease screening and prevention programs, health promotion
programs, mass vaccination programs, programs to control biological pests 
(e.g., spraying against malaria-transmitting mosquitoes), programs for the 
control of illicit drugs, or programs for response to potential bioterrorist 
attacks. Policy makers must develop plans for the provision of health care 
that address the availability of and access to health care among those whom 
the health care system serves, with consideration given to the impact of 
insurance and regulatory policies on such access. Other population-level 
policy issues include policies for the allocation of transplant organs among 
potential recipients, for managing national blood supplies, and for managing 
national vaccine and pharmaceutical stockpiles. 

1.3.2 Health care delivery 

Planning and managing health care delivery involves decisions about the 
management of health care operations and about clinical practice. 

Operations management for health care delivery Operations management 
problems that arise in the delivery of health care are similar in many ways to 
traditional problems in operations management. These include strategic 
planning problems such as design of services (e.g., inclusion of neonatal 
intensive care units in some hospitals, or provision of free-standing urgent 
care clinics or rural health workers), design of the health care supply chain 
(e.g., design of a network of hospitals, outpatient clinics, and laboratory 
services), facility planning and design (e.g., location and layout of hospitals 
and outpatient clinics, or design of material handling systems), equipment 
evaluation and selection, process selection, and capacity planning. Other 
planning problems include demand and capacity forecasting, capacity 
management, scheduling and workforce planning, job design, and 
management of the health care supply chain. Managers of health care 
systems must manage inventory (e.g., drugs, supplies, or blood), measure 
and manage system performance and quality, and assess the performance of 
health care technologies. Decision support systems must be designed and 
implemented to support all of these activities. 

Clinical Practice Clinicians face a number of important planning and 
management problems in the delivery of health care. These include 
assessing health risks and diagnosing diseases and conditions of individual 
patients. Clinicians must design and plan treatment for their patients. For 
example, they must assess how disease is likely to progress in a patient and 
then they must select appropriate drugs and dosages and design other aspects 
of a treatment regimen (e.g., surgery, radiation, rehabilitation). Clinicians 
must determine appropriate disease prevention strategies for individual 
patients (e.g., vaccination, disease screening, drug treatment, lifestyle
changes). The goal of these clinical activities is to provide the highest
quality care given the resources that are available. Doing so requires
ongoing assessment of clinical quality as well as assessment of the cost and
effectiveness of different health care interventions. A recent innovation in 
clinical practice has been the development of broad-based practice 
guidelines that specify the recommended standard of care for various 
diseases and conditions. Such guidelines are developed based on cost- 
effectiveness analysis of alternative interventions, and vary according to the 
population and setting (e.g., guidelines for treating a disease in a low-income 
country will differ from guidelines for treating the same disease in a high- 
income country). Finally, given the explosion of new medical knowledge, 
information management and decision support systems can play a crucial 
role in supporting effective and efficient clinical practice. 
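
As an illustration of the kind of calculation that underlies such guidelines, the sketch below computes an incremental cost-effectiveness ratio, a standard summary measure in cost-effectiveness analysis: the extra cost of an intervention divided by the extra health benefit it yields relative to a comparator. The function name and all cost and outcome values are hypothetical and are not drawn from this chapter.

```python
def incremental_cost_effectiveness_ratio(cost_new, effect_new, cost_old, effect_old):
    """Extra cost per extra unit of health benefit, new versus old."""
    extra_effect = effect_new - effect_old
    if extra_effect <= 0:
        raise ValueError("new intervention adds no benefit; ratio is undefined")
    return (cost_new - cost_old) / extra_effect

# Hypothetical per-patient values: cost in dollars, effect in life years.
standard_care = {"cost": 1_000.0, "effect": 10.0}
new_treatment = {"cost": 4_000.0, "effect": 10.5}

icer = incremental_cost_effectiveness_ratio(
    new_treatment["cost"], new_treatment["effect"],
    standard_care["cost"], standard_care["effect"],
)
print(f"${icer:,.0f} per life year gained")
```

Whether a given ratio is acceptable depends on the resources available, which is one reason guidelines for the same disease can differ between settings.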

1.4 OVERVIEW OF THIS BOOK 

Operations research techniques, tools, and theories have long been applied to 
a wide range of issues and problems in health care. However, to date, no 
single handbook has synthesized the wide applicability of such techniques 
and presented future challenges and avenues for research. In fact, 
practitioners, students, and researchers in this field have had difficulty 
finding a comprehensive reference that can help them improve their ability 
to apply such techniques, learn new techniques, explore new issues and 
challenges, and pursue new research avenues. This handbook aims to fill that 
need. 

This book covers applications of operations research in health care, with 
particular emphasis on health care delivery. The book is geared toward a 
multidisciplinary audience that includes OR practitioners, students, scientists 
and researchers with interest in health care (either new interest or existing 
expertise), as well as health practitioners (such as clinicians, administrators, 
and managers), students, scientists, and researchers in health sciences, health 
administration, public health, health care delivery, and health policy. 

Three main areas are covered: (1) health care operations management, (2) 
public policy and economic analysis, and (3) clinical applications. Within 
each area, a broad range of topics is addressed. Each chapter details a 
problem area, a state-of-the-art application, the methodology employed, and 
research issues raised. Each topic is structured and addressed in such a way 
that a wide audience - with varying levels of knowledge of the subject area 
or the methodology employed - will be able to access and use the material 
presented. 

This book covers topics as diverse as hospital capacity planning and
management, supply chain management for blood banking, evaluation of 
hospital efficiency, vaccine pricing policies, national drug control policy, 
decision making for bioterror preparedness, breast cancer diagnosis, optimal 
design of radiation treatments, and analysis of asthma treatments. Although 
they cover diverse topics, all of the chapters show how operations research 
can be applied to help make health care delivery more effective and 
efficient. 

1.4.1 Health care operations management 

The first main section of the book comprises chapters describing the 
application of OR models to problems in health care operations 
management. In Chapter 2, Linda Green describes how OR models have 
been and can be used for hospital capacity planning. In Chapter 3, Mark 
Daskin and Latoya Dean review the application of facility location models in 
health care. They also present a novel application of the classical set 
covering model to the analysis of cytological samples. In Chapter 4, Shane 
Henderson and Andrew Mason discuss the application of a customized 
simulation model to assist in decision making by a New Zealand ambulance 
service. In Chapter 5, William Pierskalla discusses the management of 
blood bank supply chains. In Chapter 6, Liam O'Neill and Franklin Dexter 
present a method to identify best practices among hospitals’ perioperative 
services using data envelopment analysis (DEA). In Chapter 7, Yasar 
Ozcan, Elizabeth Merwin, Kwangsoo Lee, and Joseph Morrissey describe 
the application of DEA to develop a methodology for analyzing 
organizational performance of community mental health centers. They also 
present measures of efficiency that can be used as a basis for improving 
productivity in behavioral health care. In Chapter 8, Michael Carter and 
John Blake describe four case studies of simulation applied to problems in 
hospital operations management. They describe the obstacles encountered in 
these applications, and the lessons learned. 

1.4.2 Public policy and economic analysis 

The second main section of the book comprises chapters that illustrate the 
application of OR to problems of health care policy and economic analysis. 
In Chapter 9, Rose Baker describes applications of conditional likelihood 
methods for estimating risks to public health. In Chapter 10, Thitima 
Kongnakorn and Francis Sainfort describe how medical outcomes can be
modeled in order to facilitate economic analysis of health care policy 
problems. In Chapter 11, Anke Richter presents three case studies of the 
application of OR techniques to evaluate the economic consequences and 
health benefits of new medications and treatments. In Chapter 12, Jonathan 
Caulkins provides an overview of the ways in which OR models have been
applied to evaluate policies for the control of illicit drugs. In Chapter 13, 
Gregory Zaric reviews recent OR advances in modeling maintenance 
treatment programs for opiate addicts. In Chapter 14, Harold Pollack 
describes how OR models have been used to evaluate syringe exchange 
programs and substance abuse treatment programs for injection drug users, 
and how such models can assist policy makers. In Chapter 15, Douglas 
Owens, Donna Edwards, John Cavallaro, and Ross Shachter apply a 
simulation model and economic analysis to evaluate the cost effectiveness of 
potential vaccines against HIV, the virus that causes AIDS. In Chapter 16, 
Sheldon Jacobson and Edward Sewell review the application of linear 
programming models to address a variety of economic issues surrounding 
pediatric vaccine formulary design and pricing. In Chapter 17, Margaret 
Brandeau reviews OR models that have been developed to assist in the 
allocation of resources to control infectious diseases. In Chapter 18, Stephen 
Chick, Sada Soorapanth, and James Koopman evaluate the public health 
benefits of two interventions for controlling infectious microbes in the water 
supply - improvements to centralized water treatment facilities, and 
localized point-of-use treatments in the homes of particularly susceptible 
individuals. In Chapter 19, Ruth Davies and Sally Brailsford present a 
model that evaluates policies for public health screening to detect diabetic 
retinopathy (an early indication of eye disease caused by diabetes).
In Chapter 20, Edward Kaplan and Lawrence Wein review the recent 
smallpox vaccination policy debate in the U.S., and describe the successful 
use of OR methods to influence policy in this arena. In Chapter 21, Stefanos 
Zenios reviews OR models that have been used to evaluate policies for 
allocating donor kidneys to transplant recipients. In Chapter 22, Mike 
Cushman and Jonathan Rosenhead describe the application of a model-based 
approach to the redesign of children’s health services in inner London. 

1.4.3 Clinical applications 

The third main section of the book comprises chapters that describe the 
application of OR techniques to clinical problems. In Chapter 23, Andrew 
Schaefer, Matthew Bailey, Steven Shechter, and Mark Roberts review the 
application of Markov decision process models to guide medical treatment 
decisions. In Chapter 24, Gordon Hazen describes how dynamic influence 
diagrams can be applied to model clinical decision problems. In Chapter 25, 
Elisabeth Paté-Cornell describes the application of risk analysis to evaluate
policies for reducing risk during anesthesia procedures. In Chapter 26, 
David Paltiel, Karen Kuntz, Scott Weiss, and Anne Fuhlbrigge present a 
model that simulates health and economic outcomes among patients with 
asthma, and they illustrate the application of the model to assess the cost 
effectiveness of inhaled corticosteroids among certain adult patients. In
Chapter 27, Daniel Rubin, Elizabeth Burnside, and Ross Shachter present a 
Bayesian network model that can help radiologists interpret mammograms 
and determine appropriate followup. In Chapter 28, Eva Lee and Marco 
Zaider describe an optimization model and decision support system to help 
plan radiation treatment for patients with cancer. In Chapter 29, Allen 
Holder describes linear optimization models that can be used to help design 
radiation treatments. In Chapter 30, Michael Ferris, Jinho Lim, and David 
Shepard describe the application of Matlab for radiation treatment planning.
In Chapter 31, James Koopman, Ximin Lin, Stephen Chick, and Janet 
Gilsdorf present a transmission model of a common bacterium that colonizes
the human nose and throat, and they show how the model can be used to
evaluate the relative effectiveness of different vaccines (in particular,
vaccines that reduce transmission of the bacterium versus vaccines that prevent
disease once a person’s throat has been colonized). Finally, in Chapter 32, 
David Craft, Lawrence Wein, and Dennis Selkoe present a model of the 
accumulation of amyloid β-protein (Aβ) in the brain during the course of
treatment for Alzheimer’s disease, and show how the model can be used to 
determine appropriate treatments. 

1.4.4 Conclusion 

In a recent report [6], the World Health Organization stated that, “One of the 
most important roles of the World Health Organization is to assist countries 
in making optimum use of scarce health resources.” This, too, is a role for 
operations researchers, as this book demonstrates. 

SUMMARY

Faced with diminishing government subsidies, competition, and the 
increasing influence of managed care, hospitals are under enormous pressure 
to cut costs. In response to these pressures, many hospitals have made 
drastic changes including downsizing beds, cutting staff, and merging with 
other hospitals. These critical capacity decisions generally have been made 
without the help of OR model-based analyses, routinely used in other service 
industries, to determine their impact. Not surprisingly, this has often 
resulted in diminished patient access without any significant reductions in 
costs. Moreover, payers and patients are increasingly demanding improved 
clinical outcomes and service quality. These factors, combined with their 
complex dynamics, make hospitals an important and rich area for the 
development and use of OR/MS tools and frameworks to help identify 
capacity needs and ways to use existing capacity more efficiently and 
effectively. In this chapter we describe the general background and issues 
involved in hospital capacity planning, provide examples of how OR models 
can be used to provide important insights into operational strategies and 
practices, and identify opportunities and challenges for future research. 

2.1 INTRODUCTION

2.1.1 Background

Hospitals are the locus of acute episodes of care for most serious illnesses 
and form the backbone of the emergency medical care system. Over the 
years, hospitals have been successful in using medical and technical 
innovations to deliver more effective clinical treatments while reducing 
patients’ time spent in the hospital. However, hospitals are typically rife 
with inefficiencies and delays. Patients spend hours and sometimes days in 
emergency rooms and recovery rooms waiting for beds. Procedures and 
surgeries have to be cancelled and rescheduled. Inpatients are placed in 
inappropriate beds and transferred multiple times from one unit to another. 
Nurses and other staff are often in short supply to handle peak loads. 

These inefficiencies have their roots in the regulatory and financing 
environment in which most hospitals existed until recently. Until the mid-1980s,
U.S. hospitals were paid by insurers on a “fee for service” basis and
capacity expansions were subsidized by state governments. With the 
increased prevalence of managed care and reduced government subsidies, 
hospital managers have been under increasing pressure to cut costs and have 
undertaken large-scale changes to do so. Hospitals have been merged, 
downsized, and in many cases, closed. Beds have been reorganized, units 
closed, and patients discharged earlier to increase utilization and throughput. 
Emergency rooms are getting more crowded and there are increasing reports 
of ambulance diversions due to a lack of beds. Yet, most hospitals struggle 
to operate in the black. 

In this environment, it is more important than ever for hospital managers to 
identify ways to “right-size” their facilities and deploy their resources more 
effectively. Yet, hospitals do not generally use the kind of OR/MS 
methodologies used in many other service industries to help with capacity 
planning and management. 

2.1.2 Capacity planning in hospitals: overview

The most fundamental measure of hospital capacity is the number of inpatient 
beds. Hospital bed capacity decisions have traditionally been made based on 
target occupancy levels - the average percentage of occupied beds. Historically, 
the most commonly used occupancy target has been 85%. Certain nursing units 
in the hospital, such as intensive care units (ICUs), are often run at much higher
utilization levels because of their high costs. 

Until recently, the number of hospital beds was regulated in most states under the
Certificate of Need (CON) process, under which hospitals could not be built or 
expanded without state review and approval. (In the last few years, most of these 
states have either relaxed or totally eliminated CON bed requirements.) Target 
occupancy levels were the major basis for these approvals. Though there has 
been fairly extensive literature on the use of queueing, simulation, and 
optimization models to support hospital planning [1-6], occupancy targets have 
been and continue to be the primary measure for determining bed requirements at 
the individual hospital and even hospital unit level. Faced with increased 
pressure to be more cost efficient, some hospitals are now setting target levels 
that exceed 90% without understanding and addressing the issues of bottlenecks 
and congestion in what is usually a highly stochastic, interdependent system. 

The other major component of capacity is personnel, particularly nurses. Nurses 
are the chief caregivers as well as managers of the clinical units. In recent 
studies, nursing has been found to have a significant impact on clinical outcomes 
[7]. In addition, nursing costs comprise a very substantial fraction of hospital 
budgets. In most hospitals, the number of nurses assigned to a unit is determined 
by a specified ratio of patients to nurses. The norm for most types of clinical
units has been 8:1, while for intensive care units it could be as low as 1:1.
Though most hospitals subscribe to these standards, cost pressures and a national 
nursing shortage have resulted in these ratios being exceeded in many cases. 
Sometimes, however, this is the result of a failure to adequately plan for the daily, 
weekly and sometimes seasonal variations in hospital census that are common in 
most clinical units of virtually every hospital. Though there have been many 
articles on the use of optimization models to determine nurse staffing (see 
references in [3, 8, 9]), hospitals often lack basic data, such as patient census by 
time of day, that would be needed to use such models [10]. 

Another significant component of capacity is operating rooms. Surgical 
procedures are usually a critical source of revenues for hospitals. The efficient 
use of operating rooms, which are often bottlenecks, can be central to the smooth 
functioning of the hospital as a whole. Substantial work on scheduling operating 
rooms has appeared in the OR literature (see references in [3, 11, 12]), though 
there is evidence that this resource is still a source of operational problems. 

Major diagnostic equipment, such as magnetic resonance imaging devices 
(MRIs), comprise another important category of capacity. These machines are 
extremely expensive, so operating policies are usually oriented toward achieving 
100% utilization. In order to avoid “excess” capacity and “unnecessary” usage, 
these purchases are regulated by the states under a certificate of need (CON) 
process. Hospital policies governing the use of MRIs are very varied. For 
example, in some hospitals, outpatients are scheduled on a dedicated facility
while in others, inpatients, outpatients and emergency patients all use the same
machine. Policies and priority rules are constructed and implemented without 
any OR analysis and often result in long lead times for outpatient appointments 
as well as on-site delays. See [13] for a dynamic programming approach to the 
allocation of capacity for a shared facility. 

2.2 AN ILLUSTRATION OF THE ISSUES: EMERGENCY ROOM 
DELAYS 

2.2.1 Understanding the problem 

Newspapers, magazines and television have recently reported on severe 
overcrowding of emergency departments (EDs) and increases in the amount of 
time that ambulances are being turned away from hospitals [14-16]. Though 
troubling even on the surface, these reports are even more ominous given the 
current environment of terrorist threats. So what needs to be done to improve 
hospitals' ability to respond to emergencies? 

Before looking for solutions, it is critical to first understand the nature of the 
problem. This should begin with the question: “How long should patients 
wait?” Reports of excessive delays and overcrowding can be very misleading 
unless there is an understanding of what performance standards should be 
applied. This, in turn, necessitates an understanding of the potential medical 
consequences of specific delays for each category of patients. Many patients
who arrive at an ED are “non-urgent” and would not be harmed by significant
delays in seeing a physician. Most, however, are either “emergent” (requiring 
“immediate” care) or “urgent” (requiring care within a “short” period of time). 
Within each of these broad categories, however, there is considerable variety in 
the exact nature of the illness or injury and extremely little clinical evidence 
supporting specific delay standards. Unlike, say, telephone call centers, there are 
no industry-wide standards for what constitutes excessive delays in an ED. Nor 
are there generally accepted standards for how long a patient requiring admission 
from the ED should wait for a bed. It is this latter delay that directors of EDs 
generally cite as most responsible for ED overcrowding and ambulance 
diversions. 

2.2.2 Complexities of capacity planning 

Even without specific standards, there is clearly a problem when patients wait for 
the better part of the day for a bed, when filled stretchers block walkways and 
hallways, or when a hospital must routinely turn ambulances away. What causes 
these problems? Though one likely cause (and the one most widely cited in the 
media) is the reduction of inpatient beds over the last ten years, many other 
factors must be considered. From a capacity planning perspective, the entire
process from patient arrival in the ED to placement in a bed must be examined.
Considering only the major steps, the process begins with the triage nurse, who
determines the acuity of the patient’s condition, and registration, which is usually a
clerical function. Next, the patient is seen by an ED physician. Often this results 
in a request for diagnostic testing such as blood analysis and x-rays. Laboratory 
specimens are generally collected by technicians or nurses and sent to a central 
testing facility of the hospital. If the patient needs to be taken to another location 
in the hospital for a diagnostic test, transport personnel are needed. When all 
tests are completed, the physician reviews them and determines whether the 
patient requires admission to the hospital. If so, a bed is requested in the 
appropriate nursing unit (e.g., medical, surgical, intensive care). The availability 
of a bed is affected not only by the capacity of the relevant unit, but also by the 
admission and scheduling policies of elective patients, particularly surgical 
patients who compete for the same beds as many emergency patients [17], and by 
transfer and discharge policies and procedures. Even if a suitable bed is vacant, it 
must be located and identified as empty, and then cleaned, if necessary. In 
addition, a floor nurse must be available to admit the patient. When everything is 
ready, a request is made for transport and when it is available, the patient is 
finally moved to the assigned bed. Clearly, there is the potential for a mismatch 
between the demand and availability of capacity in each step of the process. 

This description of the ED admission process illustrates the complexities of 
hospital capacity planning and management. First, it demonstrates the 
interdependencies of the various parts of the hospital and the need to identify 
bottlenecks. These bottlenecks may change from hour to hour, shift to shift,
daily, weekly and seasonally. Second, it shows the variety of both fixed 
capacity (e.g., inpatient beds, ED beds, diagnostic equipment) and variable 
capacity (e.g., nurses, physicians, technicians, housekeepers, transport staff) 
that must be managed. Third, much of the capacity required for ED 
admissions - such as inpatient beds, labs, diagnostic equipment and transport 
staff - is shared by other patients in the hospital, and thus policies and 
procedures are required to allocate these resources among the various patient 
groupings. Fourth, ED admissions are generally time-dependent with 
distinct time-of-day and day-of-week patterns as well as some seasonality. 
Therefore, it is imperative that managers develop appropriately flexible 
staffing policies as well as strategies for using fixed capacity to handle peak 
loads efficiently and effectively. Finally, in order to create a true emergency 
response system, capacity needs must be considered on a regional basis and 
ambulance dispatch and diversion policies developed to assure timely access 
to care for the most urgent patients. Given that hospitals within the same 
geographic area are likely to experience many of the same peaks in demand, 
this means that enough regional capacity should be available so that the 
probability of all hospitals within a given area being on ambulance diversion
simultaneously is extremely small. This is well illustrated by the case of
New York City, which experienced a severe and protracted citywide shortage
of inpatient hospital beds in 1987/1988 [18]. During this period, ambulances 
were routinely turned away from full hospitals and urgently sick patients 
experienced delays of days waiting for an open bed. 

2.3 HOW MANY HOSPITAL BEDS? 

2.3.1 The problem with occupancy levels 

As mentioned previously, hospitals often rely on target occupancy levels to 
plan and evaluate bed capacity. Until recent reports on ED overcrowding 
and increased ambulance diversion started surfacing, the widespread 
perception among policymakers and hospital managers was that there were 
too many hospital beds in the U.S. This belief was primarily supported by 
the discrepancy between what has usually been considered the “optimal” 
occupancy figure of 85% (see, e.g., [19], p.55) and the actual average 
occupancy rate for nonprofit hospitals which has recently been about 64% 
[20]. This and other related target occupancy levels were originally 
developed at the federal government level in the 1970’s as a response to 
accelerating health care costs and the perception that more hospital beds 
resulted in greater demand for hospital care (which was shown to occur 
under fee for service reimbursement). These occupancy targets were the 
result of analytical modeling for “typical” hospitals in various size categories 
and were based on estimates of “acceptable” delays [21]. 

What is wrong with using occupancy levels to manage capacity? First, 
reported occupancy levels are generally based on the average “midnight 
census”. This refers to the time when hospitals count patients for billing 
purposes. However, the midnight census usually measures the lowest 
occupancy level of the day. One reason is the phenomenon known as the 
“23-hour patient” who is admitted in the morning and discharged in the 
evening. Managed care companies have encouraged this practice as a way 
of allowing evaluation of a patient while avoiding unnecessary 
hospitalization. More generally, most patients are admitted in the morning 
or early afternoon and are not discharged until after attending physicians 
have conducted examinations, so that the peak census is in the middle of the 
day and can easily be 20% higher than at midnight [22]. In addition, the 
utilization of hospital facilities is far from uniform across the week or across 
the year. Very few procedures are scheduled for weekends, so elective 
patients are not usually admitted on weekends when the average daily census 
is considerably lower. Summer and holiday periods are also slower [23] and 
other seasonal effects have been observed in specific hospitals and/or for 
specific units. Reported occupancy levels are yearly averages and hence do
not reflect significantly higher levels that may exist for extensive periods of
time. For all of these reasons, reported occupancy levels are not reliable 
measures of general bed utilization. 

More importantly, bed occupancy levels do not measure or even indicate 
patients' delays for beds. Yet, hospitals do not typically measure bed delays 
nor do they use queueing or simulation models to estimate the delays that 
would result from changes in demand or the number or organization of beds. 

2.3.2 Target occupancy levels, bed delays and size 

Evaluating bed capacity based on a target probability of bed availability or other 
measure of delay can lead to very different conclusions than would be reached 
from the use of a target occupancy level. This can be illustrated in considering 
obstetrics units. Obstetrics is generally operated independently of other services, 
so its capacity needs can be determined without regard to other parts of the 
hospital. It is also one for which the use of a standard M/M/s queueing model is 
quite good. Most obstetrics patients are unscheduled and the assumption of 
Poisson arrivals has been shown to be a good one in studies of unscheduled 
hospital admissions [24]. In addition, the coefficient of variation (CV) of length 
of stay (LOS), which is defined as the ratio of the standard deviation to the mean, 
is typically very close to 1.0 [6] satisfying the service time assumption of the 
M/M/s model. 

Since obstetrics patients are considered emergent, the American College of
Obstetricians and Gynecologists (ACOG) recommends that occupancy levels of
obstetrics units not exceed 75% [25]. Many hospitals have obstetrics units
operating below this level. For example, based on the 1997 Institutional 
Cost Reports (ICRs), 117 of the 148 or 79% of New York State hospitals 
had average occupancy levels below this standard. Some have eliminated 
beds to reduce “excess” capacity and costs [26]. Conversely, fewer than 
20% of these hospitals had obstetrics units that would be considered over- 
utilized by this standard. 

But evaluation of capacity based on a delay target leads to a very different 
conclusion. Though there is no standard delay target, Schneider [27] 
suggested that the probability of delay for an obstetrics bed should not 
exceed 1%. Applying this criterion and using the ICR data in an M/M/s 
model results in 40% of the hospitals having insufficient capacity by this 
standard. The major reason for this is size. From queueing theory, we know 
that larger service systems can operate at higher utilization levels than 
smaller ones while attaining the same level of delays [28]. While obstetrics 
units are usually not the smallest units in the hospital, there are many small
hospitals, particularly in rural areas, and the units in these may only contain
5 to 10 beds. Of the New York State hospitals represented in this data, more 
than 50% had maternity units with 25 or fewer beds. How large would an 
obstetrics unit need to be to operate at a 75% occupancy level and have a 
probability of delay not exceeding 1%? The estimate based on the M/M/s 
model is that at least 67 beds are needed. Only 3 of the 148, or 2% of the 
New York hospitals represented in the 1997 ICR reports had units at least 
this large. 
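
The size effect described above can be explored with a short calculation. The following is a minimal sketch, assuming the standard Erlang C formula for the probability of delay in an M/M/s queue; the function names `erlang_c` and `min_beds_at_occupancy` are illustrative and do not come from the chapter. It searches for the smallest unit that can run at a given occupancy level without exceeding a given probability of delay.

```python
def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arrival must wait in an M/M/s queue (Erlang C).

    offered_load is the arrival rate multiplied by the mean length of stay.
    """
    if offered_load >= servers:
        return 1.0  # unstable system: every arrival is delayed
    blocking = 1.0  # Erlang B probability with zero servers
    for k in range(1, servers + 1):
        # Standard, numerically stable Erlang B recursion.
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1.0 - rho * (1.0 - blocking))


def min_beds_at_occupancy(occupancy: float, max_delay_prob: float,
                          limit: int = 1000) -> int:
    """Smallest number of beds that keeps the delay probability within
    max_delay_prob while the unit runs at the given average occupancy."""
    for beds in range(1, limit + 1):
        if erlang_c(beds, occupancy * beds) <= max_delay_prob:
            return beds
    raise ValueError("no unit size within the search limit meets the target")
```

Calling `min_beds_at_occupancy(0.75, 0.01)` gives the unit size implied by this model for a 75% occupancy level and a 1% delay target, which can be compared with the figure quoted in the text. Lower occupancy levels or looser delay targets return smaller units, which is the economy of scale discussed above.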

2.3.3 The impact of seasonality

The above discussion illustrates that policies based on target occupancy 
levels can result in less than desirable access to beds. Indeed, actual results 
are likely to be worse than described above. This is because the above 
analyses were based on average annual occupancy levels and obstetrics units 
typically experience a significant degree of seasonality in admissions. For 
example, data from Beth Israel Deaconess Hospital in Boston [6] revealed 
that the average occupancy levels varied from a low of about 68% in January 
to about 88% in July. With 56 beds, the probability of delay for an 
obstetrics bed, as estimated from the M/M/s model, for a patient giving birth 
in January is likely to be negligible, while in July, it would be about 25%. 
And if, as is likely, there are several days when actual arrivals exceed this 
latter monthly average by say 10%, this delay probability would shoot up to 
over 65%. The result of such substantial delays can vary from backups into 
the labor rooms and patients on stretchers in the hallways to the early 
discharge of patients. Clearly, hospitals need to plan for this type of 
predictable demand increase by keeping extra bed capacity that can be used 
during peak times, or by using “swing” beds that can be shared by clinical 
units that have countercyclical demand patterns. 
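
The seasonal comparison above can be sketched in a few lines. This is an illustration only, assuming the same M/M/s (Erlang C) model and treating each month as if it were in steady state at its average occupancy; the helper names are hypothetical and the results should be read as rough checks on the percentages quoted in the text rather than as a reproduction of the analysis in [6].

```python
def erlang_c(servers: int, offered_load: float) -> float:
    """Erlang C probability of delay for an M/M/s queue."""
    if offered_load >= servers:
        return 1.0
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1.0 - rho * (1.0 - blocking))


BEDS = 56  # unit size quoted in the text


def delay_probability(occupancy: float, beds: int = BEDS) -> float:
    """Delay probability when the unit runs at the given average occupancy."""
    return erlang_c(beds, occupancy * beds)


january = delay_probability(0.68)               # low-season occupancy
july = delay_probability(0.88)                  # high-season occupancy
busy_july_day = delay_probability(0.88 * 1.10)  # arrivals 10% above the July average

print(f"January: {january:.3f}  July: {july:.3f}  busy July day: {busy_july_day:.3f}")
```

The steep rise between the last two cases reflects how quickly delays grow as occupancy approaches 100%, which is why average annual occupancy is a poor guide to peak-season access.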

2.3.4 The impact of clinical organization 

Hospital beds are not all the same. In most general care hospitals, beds are 
organized into nursing units. A nursing unit generally corresponds to a specific 
physical location with a dedicated nursing staff headed by a general nurse 
manager. Each nursing unit is used for one or more clinical services, such as 
medicine, surgery, cardiology, neurology, and so forth. With the exception of a 
few services such as pediatrics, obstetrics and psychiatry, which are always 
operated as dedicated units, hospitals vary in the number and types of nursing 
units. For example, in some hospitals, nursing units may house both general 
medical and surgical patients, while others operate strictly dedicated units for
each. In addition, hospitals generally have one or more intensive care units 
(ICUs). Some hospitals have many specific types of ICUs including 
neurological, surgical, medical and cardiac. One of the distinctive features of
ICUs is that all beds have telemetry so that vital functions can be continually
monitored. However, other hospital beds may have telemetry as well and some 
patients who do not require care in an ICU may nevertheless require a telemetry 
bed. 

Hospital managers are often aware that higher occupancy levels can be 
achieved if beds are used more flexibly. Hence some have engaged in 
efforts to cross-train nurses and/or invest in more telemetry in order to treat a 
greater variety of patient types within a single unit. In addition, small 
clinical services are often combined with other services because of physical 
constraints and overhead considerations. For example, cardiac and thoracic 
surgery patients are often treated in a single unit since thoracic patients are 
relatively few and require similar nursing skills as cardiac patients. From a 
strictly operational point of view, is it always beneficial to combine clinical 
services? What factors need to be considered in evaluating alternative 
clinical organizations? 

As an example, consider the cardiac and thoracic surgery unit of Beth Israel 
Deaconess Hospital in Boston. Based on data collected for three years, the 
average arrival rate of cardiac patients in Beth Israel was 1.91 bed requests 
per day versus .42 for thoracic patients. Since no information was available 
on the pattern of admissions to these services, we assumed Poisson arrivals. 
Since most surgical patients are elective, this assumption could result in an 
overestimate of delays. However, as described in [6], other factors are likely 
to more than compensate for this. The CV of LOS was sufficiently close to 
one so that an M/M/s model produces estimates that are sufficiently reliable 
for examining the relative performance of alternative policies. 

Table 2.1a shows the number of beds required to meet several performance 
targets by each of the two services operating independently as well as in a 
combined unit. Delay in this context usually measures the time a patient 
coming out of surgery spends waiting in a recovery unit or intensive care 
unit until a bed in the surgical unit is available. Long delays are problematic 
since they cause backups in the operating room and emergency room and 
can result in surgeries being cancelled and the hospital going on ambulance 
diversion. Table 2.1a shows that for each delay target, the combined unit 
results in a savings of one bed out of a total of about 20 beds. 

However, this assumes that the admissions policy is the same for all patients. 
In Beth Israel Deaconess, as in other hospitals, cardiac patients have priority 
over thoracic patients. Table 2.1b shows the results of using a non- 
preemptive priority queueing model to estimate delays for both patient types 
[29]. Focusing on Beth Israel’s target of expected delay of less than one
day, we see again that 21 beds is the minimum that produces this result.
However, the resulting expected delay for the low priority thoracic patients 
is now more than three days. This long delay is due to the fact that thoracic 
patients represent less than 20% of the total arrivals and thus will often be 
bumped in queue by the far more prevalent cardiac patients. Even worse, this 
predicted expected delay for thoracic patients of 3.2 days is actually an 
underestimate. This is because the model assumes the same (weighted) 
average service time for both customer classes while in reality, the higher 
priority cardiac patients have an average LOS of 7.7 days versus 3.8 for 
thoracic patients resulting in even longer delays than predicted for the 
thoracic patients. If an additional bed is added, the resulting delay for 
thoracic patients goes down to 1.5 days, a more reasonable level, but there 
will be no savings over operating the units separately. And to maintain a 
maximum expected delay of one day for each patient group, the combined 
unit would actually require one more bed than the separate units. 
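
The priority calculation described above can be sketched as follows. This is a minimal illustration, assuming the classical result for a non-preemptive priority M/M/s queue in which all classes share the same exponential service time; it is not necessarily the exact procedure of [29]. The mean LOS of 7.66 days is an assumption, back-calculated from the rounded utilization of .85 at 21 beds in Table 2.1b, since the chapter does not state the value used.

```python
def erlang_c(servers: int, offered_load: float) -> float:
    """Erlang C probability of delay for an M/M/s queue."""
    if offered_load >= servers:
        return 1.0
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1.0 - rho * (1.0 - blocking))


def priority_waits(beds, arrival_rates, mean_los):
    """Expected wait for a bed, per class, under non-preemptive priority.

    arrival_rates is ordered from highest to lowest priority. All classes
    are assumed to share one exponential LOS with mean mean_los.
    """
    total_rate = sum(arrival_rates)
    capacity = beds / mean_los  # maximum discharge rate, s * mu
    if total_rate >= capacity:
        raise ValueError("unit is overloaded; expected delays are unbounded")
    base = erlang_c(beds, total_rate * mean_los) / capacity
    waits, cumulative = [], 0.0
    for rate in arrival_rates:
        higher = cumulative / capacity        # load from strictly higher classes
        cumulative += rate
        up_to_here = cumulative / capacity    # load including this class
        waits.append(base / ((1.0 - higher) * (1.0 - up_to_here)))
    return waits


ASSUMED_MEAN_LOS_DAYS = 7.66  # assumption, see the note above
cardiac_wait, thoracic_wait = priority_waits(21, [1.91, 0.42], ASSUMED_MEAN_LOS_DAYS)
print(f"cardiac: {cardiac_wait:.2f} days, thoracic: {thoracic_wait:.2f} days")
```

Because both classes share one service time in this model, the arrival-weighted average of the two waits equals the first-come, first-served wait for the same unit. Priority therefore only redistributes delay, and the small thoracic class absorbs most of it.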

Table 2.1 Cardiac and thoracic surgery utilization and delays

A. Number of beds needed to achieve expected delay (E[D]) service targets

Target         Cardiac              Thoracic             Combined
(days)      No.     Util-        No.     Util-        No.     Util-
            Beds    ization      Beds    ization      Beds    ization
  .5         19      .84           4      .40          22      .81
  1          19      .84           3      .53          21      .85
  2          18      .88           3      .53          20      .89
  3          18      .88           3      .53          20      .89

B. Delays when priority given to cardiac patients

Number of            E[D] (Days)
Beds          Cardiac   Thoracic   Overall     Utilization
  23           0.17      0.77       0.28          0.78
  22           0.28      1.53       0.5           0.81
  21           0.47      3.2        0.96          0.85
  20           0.77      7.49       1.98          0.89

Therefore, the “increased efficiency” in terms of reduced beds (and thus 
higher occupancy level) is at best small and may actually be nonexistent. Of
course, a unit of just three beds is likely to be inefficient from a physical
space and overhead perspective. Therefore, it might be beneficial to operate 
the two services in one unit but employ a policy, such as a dynamic priority 
scheme, that would better balance the delays experienced by the two patient 
types. As a simple example, an admissions policy could give priority to 
cardiac patients as long as no thoracic patient has been waiting for T days. 
As soon as this threshold is reached, the policy reverts to first-come, first- 
served. 
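
The threshold rule just described can be written as a short selection function. This is only a sketch of the simple example in the text; the `BedRequest` record and the function name `next_admission` are hypothetical, and times are measured in days.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BedRequest:
    patient_id: str
    service: str          # "cardiac" or "thoracic"
    requested_at: float   # time of the bed request, in days


def next_admission(waiting: Sequence[BedRequest], now: float,
                   threshold_days: float) -> Optional[BedRequest]:
    """Pick the next patient to admit when a bed becomes free.

    Cardiac patients have priority unless some thoracic patient has waited
    at least threshold_days, in which case the rule reverts to
    first-come, first-served.
    """
    if not waiting:
        return None
    by_arrival = sorted(waiting, key=lambda r: r.requested_at)
    thoracic_overdue = any(
        r.service == "thoracic" and now - r.requested_at >= threshold_days
        for r in waiting
    )
    if thoracic_overdue:
        return by_arrival[0]  # first-come, first-served
    cardiac = [r for r in by_arrival if r.service == "cardiac"]
    return cardiac[0] if cardiac else by_arrival[0]
```

Choosing the threshold T trades cardiac delay against thoracic delay: a very large T reproduces strict cardiac priority, while T = 0 is equivalent to first-come, first-served.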

Another factor that needs to be considered in evaluating the benefits of a 
nursing unit with several clinical services is the degree of disparity in the 
LOS profile of the patients. Smith and Whitt [30] give examples of how
combining customers who have different average service times can increase 
the variance of the service time in the combined queue and result in longer 
average delays. It is also possible that the average LOS could increase for 
one or more patient groups due to the reduced expertise that comes with a 
more generally trained staff. 
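
The variance effect can be illustrated with the cardiac and thoracic figures quoted earlier. The sketch below assumes that each service has an exponentially distributed LOS (CV of 1) and that the pooled unit sees a mixture weighted by arrival rates; the function name is illustrative. It is a simple arithmetic check and not a reproduction of the examples in [30].

```python
def pooled_los_stats(arrival_rates, mean_los):
    """Mean and coefficient of variation of LOS in a pooled unit.

    Each class is assumed to have an exponential LOS, so its second
    moment is 2 * mean**2. The pooled LOS is a mixture of the classes
    weighted by their share of arrivals.
    """
    total = sum(arrival_rates)
    weights = [rate / total for rate in arrival_rates]
    mean = sum(w * m for w, m in zip(weights, mean_los))
    second_moment = sum(w * 2.0 * m * m for w, m in zip(weights, mean_los))
    variance = second_moment - mean * mean
    return mean, variance ** 0.5 / mean


mean, cv = pooled_los_stats([1.91, 0.42], [7.7, 3.8])
print(f"pooled mean LOS = {mean:.2f} days, CV = {cv:.3f}")
```

When the class means are equal the pooled CV stays at 1; any difference in means pushes it above 1, and in a queueing model higher service time variability tends to lengthen average delays.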

2.3.5 The seven-day hospital? 

In most hospitals, elective procedures and diagnostic testing come to a virtual 
stop on weekends. As a result, average bed occupancy levels are considerably 
lower and heavily demanded equipment such as MRIs are idle. Pressures to 
increase patient throughput are causing hospitals to think about the potential 
benefits of a “seven-day hospital”. On the cost side, scheduling elective
procedures and tests on weekends would require additional staffing, perhaps at 
overtime rates in some cases. What might be gained? 

To illustrate the possible impact of a seven-day hospital on capacity needs, 
consider the case of a surgical intensive care unit (SICU). Most patients in an 
SICU are elective and therefore admissions drop significantly on the weekend. 
The data from one such unit, shown in Table 2.2, illustrate a typical pattern, with 
the average admission rate peaking at 4.42 patients per day on Tuesday and 
dropping to only 1.44 patients on Sunday. Given this demand profile and an 
average LOS of 3.05 days, how many beds are needed in this unit? 

Table 2.2 Surgical intensive care - Admissions

Day            Admissions/Day
Sunday              1.44
Monday              3.36
Tuesday             4.42
Wednesday           3.59
Thursday            3.92
Friday              4.40
Saturday            2.21
Average             3.34

Using numerical integration to solve the differential equations that describe this
nonstationary queueing process, we find that 17 beds are needed to assure that
the daily probability of delay is below 10%. Now assume that the same number
of admissions is smoothed over the entire seven-day week. Using the average
daily arrival rate of 3.34 patients in an M/M/s model indicates that only 15 beds
are now needed to meet this target performance. What if 15 beds are used but the
demand is not smoothed over the week? Then the nonstationary model indicates
that while the average probability of delay over the week would be about 11%,
the daily probability of delay would peak on Fridays at about 18% with an
expected delay of over 13 hours for those who are delayed [6]. The result of this
might be a backup of patients in the surgical recovery room which could result in
the cancellation of some surgeries scheduled for the end of the week. The
“optimal” capacity and operating policy could be determined by weighing the
expected revenue loss against the alternatives of expanded bed capacity and the
additional staffing costs associated with conducting a regular surgical schedule
on weekends.
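
The kind of calculation described above can be sketched as follows. This is a rough illustration and not the procedure of [6]: it assumes Poisson arrivals at a rate that is constant within each day, exponential LOS, a forward Euler integration of the state-probability equations, and a queue truncated at a hypothetical maximum length. The probability of delay on a day is taken as the fraction of that day during which all beds are occupied.

```python
DAILY_ADMISSIONS = [1.44, 3.36, 4.42, 3.59, 3.92, 4.40, 2.21]  # Sunday..Saturday, Table 2.2
MEAN_LOS_DAYS = 3.05


def daily_delay_probabilities(beds, daily_rates, mean_los=MEAN_LOS_DAYS,
                              weeks=10, steps_per_day=100, max_queue=60):
    """Probability of delay for each day of the week in a nonstationary
    M(t)/M/s queue, after a warm-up of several weeks from an empty unit."""
    mu = 1.0 / mean_los
    top = beds + max_queue  # largest number of patients tracked (assumed cap)
    discharge = [min(n, beds) * mu for n in range(top + 1)]
    p = [1.0] + [0.0] * top  # state probabilities, starting empty
    dt = 1.0 / steps_per_day
    daily = []
    for _week in range(weeks):
        daily = []  # keep only the final week's values
        for lam in daily_rates:
            delayed = 0.0
            for _step in range(steps_per_day):
                delayed += sum(p[beds:]) * dt  # time with all beds occupied
                nxt = []
                for n in range(top + 1):
                    inflow = (lam * p[n - 1] if n > 0 else 0.0) + \
                             (discharge[n + 1] * p[n + 1] if n < top else 0.0)
                    outflow = ((lam if n < top else 0.0) + discharge[n]) * p[n]
                    nxt.append(p[n] + dt * (inflow - outflow))
                p = nxt
            daily.append(delayed)
    return daily


unsmoothed = daily_delay_probabilities(15, DAILY_ADMISSIONS)
smoothed = daily_delay_probabilities(15, [3.34] * 7)
print([round(x, 3) for x in unsmoothed], round(smoothed[-1], 3))
```

With a constant arrival rate the sketch settles at the stationary M/M/s delay probability, which gives a convenient check on the integration. Running it with 15 and 17 beds shows how the weekly admission pattern, and not only the average rate, drives the number of beds needed.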

2.4 STAFFING THE ED: HOW SHOULD LEVELS VARY ACROSS 
THE DAY? 

2.4.1 Overview 

Visits to emergency departments (EDs) have been increasing while the 
number of emergency departments has been decreasing. This has put a 
significant strain on the directors of emergency departments to keep patient 
delays in receiving treatment reasonable. The most critical resource for 
controlling delays is the physician staff. However, unlike hospital beds, the 
number of available physicians can be adjusted to accommodate varying 
arrival volumes. 

Hospital managers are aware that arrivals to EDs are very variable with 
time-of-day, day-of-week and even seasonal patterns. Under federal law, 
emergency rooms are required to allow all patients access to care 24 hours a 
day, regardless of ability to pay. Therefore, people who lack health 
insurance (currently more than 44 million in the U.S.), as well as others who 
may have difficulty gaining access to primary care physicians, use hospital 
emergency rooms as their sole source of treatment.

Matching physician capacity to patient needs is critical to the ED’s ability to
provide timely care to urgently ill or injured patients. Given the substantial 
variability and unpredictability of demand, as well as the diversity of 
patients and their medical needs, determining physician staffing levels is 
very challenging. Yet, as in other areas of the hospital, decisions are not 
generally based on the use of OR models. 

2.4.2 Using queueing models to determine physician staffing: an example 

Figure 2.1 illustrates the arrival pattern for the busiest weekday of an ED in
a mid-sized urban medical center. Arrivals range from a low of about .9 per
hour in the middle of the night to over 5 per hour in the middle of the day.
Also shown are physician staffing levels over the day based on the judgment 
of the ED directors. No explicit data was kept on the duration of physician 
examination times, and though no data was kept on patients’ delays before 
seeing a physician, delays were observed to be very long, particularly during 
the late afternoon and evening hours. This resulted in a high rate of 
“walkouts” - patients who leave after registering but before being seen by a 
physician - a matter of great concern to the ED directors as well as senior 
management. 

At the time of this study, a request for additional physician hours was under 
consideration by senior hospital officials. To determine the appropriateness 
of using queueing models to guide the allocation of any additional staffing, 
current performance was estimated by using the empirical demand data, the 
mean physician exam time (estimated to be 45 minutes) and the staffing 
levels shown in Figure 2.1, and solving the differential equations that 
describe the time-varying behavior of the system based on Poisson arrivals 
and exponential service times [31]. In order to represent the true workload 
in the system, the realized demand for physicians was derived from the 
arrival data shown in Figure 2.1 by adjusting for walkouts. The walkout rate 
was about 14.1% over the day. Based on a survey of U.S. ED directors [32] 
and discussions with ED managers, we adopted as our primary performance 
measure the probability of delay exceeding one hour, or Pr(D > 1). Figure 2.2
shows the time-varying behavior of this performance measure resulting
from the staffing pattern shown in Figure 2.1 (see [33] for the derivation of
this calculation). The results, showing Pr(D > 1) ranging from a low of .25
at 5 a.m. to a high of .87 at 11 a.m., were considered by the ED managers as
consistent with empirical observations.

To help identify the number and scheduling of ED physicians that would 
yield more acceptable performance, we used a target of Pr (D > 1) < .10. 





Figure 2.2 Actual staffing levels and estimated Pr(Delay > 1 hour)

Traditionally, in a service system with time-varying arrivals, the desired
staffing levels would be determined by the stationary independent period by 
period or SIPP approach which begins by dividing the workday into 
planning periods, such as shifts, hours, half-hours, or quarter-hours. Then a 
series of stationary queueing models, most often M/M/s type models, are 
constructed, one for each planning period. Each of these period-specific 
models is independently solved for the minimum number of servers 
needed to meet the service target in that period. In a similar vein, 
Vassilacopoulos [34] used a dynamic programming model to determine 
physician staffing levels in an ED assuming that the allocation in each hour 
should be proportional to the corresponding arrival rate for that hour. In 
[31], the SIPP approach was shown in many cases to seriously underestimate 
the number of servers needed to meet a given delay performance target. 
This is particularly true when the mean service times are high (e.g., 30 
minutes or more) and planning periods are long (two hours or more). In 
these situations, it was demonstrated that a simple variant of SIPP, called 
Lag SIPP, performs far better than the simple SIPP approach. The major 
reason is that in cyclical demand systems, there is a time lag between the 
peak in the arrival rate and the peak in system congestion. This lag is 
significant when the mean service time is long. Lag SIPP corrects for this 
factor. 
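
The difference between the two approaches can be sketched in code. The example below is illustrative only. The hourly arrival rates are hypothetical, chosen to range from about .9 to just over 5 per hour since the data behind Figure 2.1 are not given here. Lag SIPP is implemented under the assumption that it shifts the arrival-rate curve by one mean service time before averaging over each planning period; see [31] for the actual method. The delay measure is the standard M/M/s tail probability, Pr(D > t) = C(s, a) exp(-(s*mu - lambda)*t).

```python
import math

# Hypothetical hourly arrival rates (patients per hour), midnight to 11 p.m.
HYPOTHETICAL_HOURLY_ARRIVALS = [1.4, 1.2, 1.0, 0.9, 0.9, 1.0, 1.5, 2.2, 3.0, 3.8, 4.5, 5.0,
                                5.2, 5.0, 4.8, 4.6, 4.4, 4.2, 4.0, 3.6, 3.0, 2.5, 2.0, 1.6]
MEAN_EXAM_HOURS = 0.75  # 45-minute mean physician exam time, from the text
TARGET = 0.10           # require Pr(D > 1 hour) < .10


def erlang_c(servers, offered_load):
    """Erlang C probability of delay for an M/M/s queue."""
    if offered_load >= servers:
        return 1.0
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1.0 - rho * (1.0 - blocking))


def prob_wait_exceeds(servers, lam, mean_service, t):
    """Pr(D > t) in a stationary M/M/s queue with arrival rate lam."""
    load = lam * mean_service
    if load >= servers:
        return 1.0
    return erlang_c(servers, load) * math.exp(-(servers / mean_service - lam) * t)


def min_servers(lam, mean_service, t, target):
    """Fewest servers for which Pr(D > t) is below the target."""
    servers = 1
    while prob_wait_exceeds(servers, lam, mean_service, t) >= target:
        servers += 1
    return servers


def period_rates(hourly, period_hours=2, lag_hours=0.0, samples_per_hour=60):
    """Average arrival rate per planning period, optionally lagged."""
    rates = []
    for start in range(0, 24, period_hours):
        n = period_hours * samples_per_hour
        total = 0.0
        for k in range(n):
            t = (start + (k + 0.5) / samples_per_hour - lag_hours) % 24
            total += hourly[int(t) % 24]
        rates.append(total / n)
    return rates


def staffing(hourly, period_hours=2, lag_hours=0.0):
    """Physicians per planning period from independent M/M/s models."""
    return [min_servers(lam, MEAN_EXAM_HOURS, 1.0, TARGET)
            for lam in period_rates(hourly, period_hours, lag_hours)]


sipp = staffing(HYPOTHETICAL_HOURLY_ARRIVALS)
lag_sipp = staffing(HYPOTHETICAL_HOURLY_ARRIVALS, lag_hours=MEAN_EXAM_HOURS)
print("SIPP:    ", sipp)
print("Lag SIPP:", lag_sipp)
```

Both schedules use the same total demand; the lagged version simply moves staff later in the cycle so that capacity follows congestion instead of arrivals. Neither schedule is validated here against the time-varying model, which is the step that reveals where SIPP falls short.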

We used both the SIPP and Lag SIPP approaches with the unadjusted 
empirical arrival data to compare the current staffing levels with the staffing 
levels the models suggest would be needed to serve the total demand at the 
targeted level of performance. As expected, both the SIPP and Lag SIPP 
approaches indicated that current staffing of 55 physician-hours per day would 
need to increase substantially, by about 63%, to meet this target. Though both 
the SIPP and Lag SIPP methods suggested a total of 90 physician-hours per day, 
the staffing pattern suggested by the SIPP approach resulted in Pr(D > 1) 
exceeding the target of .10 by more than 10% for 4 hours of the day and 
attaining a maximum of .22 for one 2-hour period. In contrast, the Lag SIPP 
method yielded staffing estimates that met the target delay in every period. 
Figure 2.3 shows the Lag SIPP proposed staffing levels as well as the 
predicted Pr(D > 1) curve. 

Though the hospital was not in a position to hire this many new physicians, 
the ED director was interested in the staffing pattern suggested by Figure 
2.3. One important insight was that the changes in staffing levels generally 
lag the changes in the arrival rate by one planning period. 

The Lag SIPP model was also used to explore other alternatives. First, the 
performance target was relaxed so that Pr(D > 1) < .2. In this case, the Lag 
SIPP results indicated that the staffing would need to increase by about 50% 
to 82 physician-hours. (Interestingly, the SIPP model suggested a total of 84 
physician-hours for this case.) This was still considered too expensive. 
However, Lag SIPP does not necessarily result in an optimal allocation and, 
looking at the predicted curve for Pr(D > 1) shown in Figure 2.4, we noticed 
that this probability dips significantly between 7 a.m. and 2 p.m. Therefore, 
we postulated that we could reduce the staffing by one physician in each of the 
2-hour periods starting at 8 a.m. The result, shown in Figure 2.5, is that 
the delay target is still never exceeded by more than 10% in any 2-hour 
period. This pattern was used by the ED directors as a guide to reallocating 
their current physician staff over the day. 

Figure 2.3 Lag SIPP staffing, Pr(Delay > 1 hour) < .10 

Figure 2.4 Lag SIPP staffing, Pr(Delay > 1 hour) < .2 

To refine the model’s recommendations, it would have been helpful to 
consider priority classes, since it is most important that the emergent and 
urgent patients be seen by a physician within a given time window. 
However, no reliable data were kept on arrivals by priority class and the 
hospital had no immediate plans to collect them. 

2.4.3 Transport staffing: another potential source of delays 

Though a lack of appropriate inpatient beds is usually cited as the major 
reason for ED overcrowding, patients often experience delays even when 
beds are available. In fact, as illustrated in Figure 2.6, which shows 
ambulance diversions by time of day for all hospitals in Manhattan from 
1999 through 2001, one of the two most frequent times for ED 
overcrowding and hence diversions is from midnight to 2 a.m. However, 
this is the time period when hospital occupancy levels are lowest. 

One reason for this seeming anomaly was identified in one large New York 
hospital where a data collection effort showed that the time between bed 
assignment and the patient leaving the ED rose from an average of 2.1 
hours to between 3 and 4 hours during the midnight to 4 a.m. time interval. 
Further analysis revealed three reasons for this. First, the demand for 
transports peaked at about 8 patients per hour starting at midnight, up from a 
daytime average of about 7. This counterintuitive finding was due to the 
combination of peak arrival rates that started at about noon and stayed high 
until early evening, and an average duration of 8.2 hours between arrival 
time and bed assignment. Second, because ED arrival rates drop to near 
their lowest during this time, hospital managers had decided that transport 
staff should be reduced starting at midnight from two to one. Third, it 
was found that while the average transport time during daytime hours was about 
20 minutes, this increased to 27 minutes starting at midnight. This was 
attributed to the fact that during the day, ED transport personnel were used 
for transporting patients to diagnostic facilities located within the ED as well 
as to inpatient beds, while at night, when these facilities are closed, 
personnel were used only for transporting patients to beds. As a result of a 
queueing analysis that incorporated these factors, the hospital added a 
transporter during the midnight to 4 a.m. period, with a subsequent average 
decrease of over an hour in transport delays. 

Figure 2.5 Modified Lag SIPP staffing, Pr(Delay > 1 hour) 

In addition to transport personnel, most hospitals reduce other support staff 
at midnight. Many of these staff affect ED delays, including nurses, who are 
needed to physically receive patients on the floors; housekeepers, who must 
make sure beds are prepared; and other personnel who are responsible for 
locating beds. The above demonstrates the need to properly identify and 
analyze the impact of time-varying effects of both demands and processing 
times throughout the hospital in order to alleviate ED overcrowding. 

2.5 FUTURE RESEARCH OPPORTUNITIES AND CHALLENGES 

2.5.1 Creating flexibility 

As indicated in the examples above, patients often experience serious delays 
due to highly variable patient demands and capacity constraints. Yet, 
hospitals are often reluctant or unable to add capacity because of cost 
pressures, regulatory constraints, or a shortage of appropriate personnel. 
This makes it extremely important to use existing capacity most efficiently. 
Increasing bed flexibility can be a key strategy in alleviating congestion. 
However, no comprehensive analysis has evaluated alternatives or identified 
good policies regarding bed flexibility. Two approaches that have been used 
in some hospitals are worthy of comprehensive analysis. 

As noted before, the degree to which inpatient beds are segregated into 
nursing units dedicated to one or more clinical services varies across 
hospitals. From a medical perspective, there may be benefits derived from 
having patients clustered by diagnostic categories in dedicated units 
managed and staffed by specialized nurses. These include shorter LOS, 
fewer adverse events and fewer readmissions. Yet, many hospital managers 
believe that nurses can be successfully cross-trained and that increasing bed 
flexibility is ultimately in the best interests of patients because it speeds 
access to beds and minimizes the number of bed transfers. By 
incorporating waiting times, percentage of “off-placements” and the effects 
on LOS, OR models can be used to address some important research 
questions dealing with these tradeoffs including: 

1. For a given predicted set of demands and a fixed number of nursing 
units of a given size, how should clinical services be clustered into 
nursing units? 







a. What is the minimum amount of flexibility needed to assure 
timely access to beds? Can this best be achieved by assigning each 
clinical service to only one nursing unit, or by allowing some 
diagnostic categories to be served in multiple units? 

b. Which services should be consolidated into a common unit? How 
should this be affected by LOS characteristics? By nursing 
requirements? By other resource requirements? 

2. For a given nursing unit configuration, what is an optimal real-time bed 
allocation policy? For example, in the event that there is no appropriate 
bed available when needed by a new patient, should the patient be 
placed in another available bed or wait (e.g. in the emergency room or 
recovery room) until the “right” bed becomes available? 

3. When services share a common nursing unit, what admissions policy 
should be used if there are differing levels of urgency associated with 
different patient types? For example, in the case of the cardiac and 
thoracic surgery unit described previously, what type of dynamic 
priority rule should be used to assure an appropriate level of bed 
availability for both patient types? 

Another approach for increasing bed flexibility is the use of “overflow” units 
or “swing” beds. These often exist in hospitals that have downsized by 
closing units without converting them to another use. This results in beds 
that are not normally staffed but may be used when bed demand increases 
substantially. A related strategy is to use units that generally have more 
predictable demand and lower occupancy levels to serve as overflow units 
for those that frequently fill up. These practices raise several important 
planning and policy issues such as the following: 

1. Given the associated fixed and variable costs, what are the optimal 
policies for opening and shutting a normally unused overflow unit? 

2. How many swing beds should a hospital have and for which clinical 
services? 

3. How should clinical units be used to “back up” each other so as to 
minimize overall off-service placements without jeopardizing the 
timely provision of care? 

The above strategies increase “horizontal” bed flexibility. Some hospitals 
have increased “vertical” bed flexibility by reducing the number of different 
areas in which certain categories of patients reside during their stay. For 
example, the traditional patient flow model for maternity patients is to move 
from a labor room to a delivery room to a recovery room and then, finally, to 
a postpartum bed. Similarly, critically ill patients may spend time in an ICU 
followed by a “step-down” unit and finally a non-monitored inpatient bed 
before being discharged. Yet some maternity units have combined 
labor/delivery/recovery rooms, and some hospitals do not have “step-down” 
units. OR-based analyses could help shed light on which of these 
alternatives is more attractive and under what conditions. 

2.5.2 Allocating capacity among competing patient groups 

Many hospitals provide service to three distinct categories of patients: 
inpatients, outpatients and emergency patients. These patient groups have 
differing medical, financial and service requirement profiles, but often 
require the same set of resources including laboratories, imaging facilities 
and operating rooms. One important example is magnetic resonance 
imaging machines (MRIs). A hospital MRI or “magnet” is a very expensive 
piece of equipment and is critical in diagnosing a broad variety of illnesses, 
each of which may require a unique examination protocol and duration. For 
these reasons, utilization of MRIs tends to be very heavy and unpredictable and, 
consequently, significant delays are common. Delays are compounded by 
late arrivals, cancellations and “no-shows.” Operating rooms have very 
similar characteristics. 

Research on operational policies for these types of shared resources could be 
very useful in increasing their efficiency and service performance. 
Important questions include: 

1. How should outpatient (or elective patient) schedules be designed so as 
to allow for timely access by emergency patients and/or inpatients 
without resulting in unacceptable backups? 

2. Given the costs of delay for each patient type, what dynamic priority 
rules are optimal for allocating time slots during the day when more 
than one type of patient is waiting? (See [13] for some work on this 
issue.) 

3. Assuming that the likelihood of cancellations and “no-shows” increases 
with the duration of time between when an appointment is made and 
the scheduled examination date, what is the optimal length of the 
scheduling horizon? 

4. When a hospital has multiple diagnostic or treatment facilities, how 
many and which patient categories should be assigned to each? 

2.5.3 Regional capacity planning 

The merger activity of the 1990s has resulted in networks of hospitals 
within certain geographical regions that have various sorts of contractual 
commitments to coordinate their planning and activities to some degree. 
Though these types of associations are often formed primarily to enhance 
hospitals’ bargaining power with payers and suppliers, in some cases an 
important goal is to streamline and improve the delivery of health care. One 
possible means of increasing operational efficiency is through clinical 
consolidation or “regionalization” of one or more clinical services. In other 
words, it could be advantageous to offer a particular clinical service in a 
single location. One example of a service that has been considered for such 
treatment is obstetrics. As discussed above, most obstetrics patients require 
quick access to beds and most obstetrics units are relatively small. The 
result is that average occupancy levels must be quite low to ensure timely 
access to beds. Consolidating obstetrics units across two or more 
hospitals in a region would clearly result in bed savings and would likely 
yield greater administrative and staffing efficiencies. Other candidates for 
regionalization are clinical services with small patient demands or those that 
involve unique technologies and/or skills, such as burn units. However, in 
assessing the desirability of any clinical regionalization, patient travel 
distances and times must be considered. OR-based analyses could be very 
helpful in identifying candidate services for regionalization and in 
determining which hospitals in a given geographic region might be best able 
to provide a given clinical service. 
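The bed savings from consolidation can be illustrated with a small calculation. The sketch below is only an illustration under assumed conditions: each unit is modeled as an M/M/s queue, "timely access" is taken to mean that the Erlang-C probability of any delay for a bed stays at or below a target, and the offered loads (average number of occupied beds) and the 5% target are hypothetical values, not data from any hospital.

```python
def erlang_c(s, a):
    """Probability that an arriving patient must wait in an M/M/s queue (load a < s)."""
    b = 1.0  # Erlang-B blocking probability with zero beds
    for k in range(1, s + 1):
        b = a * b / (k + a * b)  # standard Erlang-B recursion
    return b / (1.0 - (a / s) * (1.0 - b))

def beds_needed(load, target):
    """Smallest number of beds whose delay probability does not exceed the target."""
    s = int(load) + 1  # start just above the offered load so the queue is stable
    while erlang_c(s, load) > target:
        s += 1
    return s

# Hypothetical example: two obstetrics units, each with an offered load of 10 beds.
load_per_unit = 10.0
target = 0.05

separate = 2 * beds_needed(load_per_unit, target)
merged = beds_needed(2 * load_per_unit, target)

print("beds with two separate units:", separate)
print("beds with one consolidated unit:", merged)
print("average occupancy, separate:", round(2 * load_per_unit / separate, 2))
print("average occupancy, merged:", round(2 * load_per_unit / merged, 2))
```

The consolidated unit meets the same target with fewer beds and therefore runs at a higher average occupancy. The calculation ignores patient travel distances and times, which the text notes must also be weighed.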

Another dimension of regional planning is emergency responsiveness. 
Increasingly, hospitals are coordinating efforts to communicate and respond 
to unanticipated spikes in demand for emergency department services and 
inpatient capacity. This has become more of a priority since the events of 
September 11, 2001, and the resulting increased concern with preparedness 
in the event of terrorist attacks. Initial efforts have focused on developing 
better communications and information systems to collect and disseminate 
relevant information quickly among hospitals and public agencies. Little 
attention has been given to identifying which hospitals, clinical units and 
resources might be vulnerable given sudden, unanticipated surges in demand 
within and across a given region. (See [26] for some initial work on this 
issue.) More fundamentally, there is no widely accepted definition of 
emergency room overcrowding nor agreement on hospital policies for 
ambulance diversion. Emergency planning is a complex, multi-dimensional 
issue involving a high degree of unpredictability. The following questions 
illustrate some broad areas of potential research: 

1. How should hospital planning regions be defined? Should this 
definition differ by clinical service? 

2. When should a hospital go on ambulance diversion? How should this 
be affected by conditions at the other hospitals in the region? 

3. How should a hospital’s “surge capacity” (the percentage increase in 
demand above normal levels that can be “adequately” accommodated), 
be defined and predicted? 

2.5.4 Conclusion 

Hospital managers are increasingly aware of the need to use their resources 
as efficiently as possible in order to continue to assure that their institutions 
survive and prosper. As this chapter has attempted to demonstrate, effective 
capacity management is critical to this objective as well as to improving 
patients’ ability to receive the most appropriate care in a timely fashion. 
Yet, effective capacity management must deal with complexities such as 
tradeoffs between bed flexibility and quality of care, demands from 
competing sources and types of patients, time-varying demands, and the 
often differing perspectives of administrators, physicians, nurses and 
patients. All of these are chronic and pervasive challenges affecting the 
ability of hospital managers to control the cost and improve the quality of 
healthcare delivery. 

From an analytical perspective, these capacity management issues involve 
complex dynamics that will require the development of new optimization, 
queueing and simulation models in order to gain insights to guide strategies 
and decisions. However, a major obstacle to developing and applying these 
much needed models is a lack of relevant operational data. Hopefully, as 
management information systems continue to be developed and enhanced, 
hospitals will prove to be an extremely rich area for using OR/MS models to 
improve the quality of healthcare delivery and, perhaps, ultimately save lives 
as well as money. 

SUMMARY 

This chapter reviews the location set covering model, maximal covering 
model and P-median model. These models form the heart of the models 
used in location planning in health care. The health care and related location 
literature is then classified into one of three broad areas: accessibility 
models, adaptability models and availability models. Each class is reviewed 
and selected formulations are presented. A novel application of the set 
covering model to the analysis of cytological samples is then discussed. The 
chapter concludes with directions for future work. 

3.1 INTRODUCTION 

The location of facilities is critical in both industry and in health care. In 
industry, poorly located facilities or the use of too many or too few facilities 
will result in increased expenses and/or degraded customer service. If too 
many facilities are deployed, capital costs and inventory carrying costs are 
likely to exceed the desirable value. If too few facilities are used, customer 
service can be severely degraded. Even if the correct number of facilities is 
used, poorly sited facilities will result in unnecessarily poor customer 
service. 

In health care, the implications of poor location decisions extend well 
beyond cost and customer service considerations. If too few facilities are 
utilized and/or if they are not located well, increases in mortality (death) and 
morbidity (disease) can result. Thus, facility location modeling takes on an 
even greater importance when applied to the siting of health care facilities. 

This chapter begins with a review of three basic facility location models 
from which most other models are derived: the set covering model, the 
maximal covering model, and the P-median model. Next, we discuss three 
major focal points of the location literature as it applies to health care 
facilities: accessibility, adaptability and availability. In the course of doing 
so, we review selected models and applications that have appeared in the 
literature. Our purpose is not to provide a comprehensive survey; rather our 
goal is to give the reader a feel for the models that have been proposed and 
the problems to which they have been applied. The reader interested in a 
more general introduction to facility location modeling should consult [1-4]. 
More recently Marianov and ReVelle [5] reviewed emergency siting models, 
Current, Daskin and Schilling [6] summarized general location models, 
Marianov and Serra [7] discussed the application of facility location models 
to problems in the public sector and Berman and Krass [8] summarized the 
state of the art in modeling problems with uncertainty and congestion, two 
issues we will return to below. We conclude the chapter by discussing an 
emerging health care application of facility location models that has nothing 
to do with the location of new physical facilities. We see such applications 
and adaptations of existing models as an important area for future research. 

3.2 BASIC LOCATION MODELS 

In this section we review three classic facility location models that form the 
basis for almost all of the facility location models that are used in health care 
applications. These are the set covering model, the maximal covering 
model, and the P-median model. 

All three models are in the class of discrete facility location models, as 
opposed to the class of continuous location models. Discrete location 
models assume that demands can be aggregated to a finite number of 
discrete points. Thus, we might represent a city by several hundred or even 
several thousand points or nodes (e.g., census tracts or even census blocks). 
Similarly, discrete location models assume that there is a finite set of 
candidate locations or nodes at which facilities can be sited. Continuous 
location models assume that demands are distributed continuously across a 
region much the way peanut butter might be spread on a piece of bread. 
These models do not necessarily assume that demands are uniformly 
distributed, though this is a common assumption. Likewise, facilities can 
generally be located anywhere in the region in continuous location models. 
Throughout this chapter we restrict our attention to discrete location models 
since they have been used far more extensively in health care location 
problems. 

At the heart of the set covering and maximal covering models is the notion 
of coverage. Demands at a node are generally said to be covered by a 
facility located at some other node if the distance between the two nodes is 
less than or equal to some exogenously specified coverage distance. 
Typically, the coverage distance is the same for all demand nodes, though 
additional restrictions on the set of candidate locations that can cover any 
particular demand node may be added. Such additional restrictions might 
reflect the ease of travel between population centers and a candidate site for 
a local clinic. For example, significant elevation changes might be penalized 
relative to flat terrain [9, 10]. Whether or not additional restrictions are 
placed on the cover sets, the mathematics is basically the same. 

We define an indicator variable as follows: 

\[ a_{ij} = \begin{cases} 1 & \text{if demand node } i \text{ can be covered by a facility at candidate site } j \\ 0 & \text{if not} \end{cases} \]

The set covering model [11] attempts to minimize the cost of the facilities 
that are selected so that all demand nodes are covered. To formulate this 
model, we need the following additional sets and inputs. 

I = set of demand nodes 
J = set of candidate facility sites 
f_j = fixed cost of locating a facility at candidate site j 



In addition, we need the following decision variable. 

\[ X_j = \begin{cases} 1 & \text{if we locate at candidate site } j \\ 0 & \text{if not} \end{cases} \]

With this notation, we write the set covering problem as follows: 

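\[ \text{Minimize} \quad \sum_{j \in J} f_j X_j \tag{1} \]

\[ \text{Subject to} \quad \sum_{j \in J} a_{ij} X_j \ge 1 \quad \forall i \in I \tag{2} \]

\[ X_j \in \{0, 1\} \quad \forall j \in J \tag{3} \]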

The objective function (1) minimizes the total cost of all selected facilities. 
Constraint (2) stipulates that each demand node must be covered by at least 
one of the selected facilities. The left hand side of (2) represents the total 
number of selected facilities that can cover demand node i. Finally, 
constraints (3) are standard integrality conditions. 
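To make formulation (1)-(3) concrete, the following sketch builds the coverage indicators from a distance matrix and solves a tiny instance by enumerating every subset of candidate sites. It is a brute-force illustration only, practical for a handful of sites, and not one of the solution methods cited in this chapter. The distance matrix, fixed costs and coverage distance are hypothetical, and every demand node is assumed to double as a candidate site.

```python
from itertools import combinations

def coverage_matrix(dist, cover_dist):
    """a[i][j] = 1 if candidate site j is within cover_dist of demand node i, else 0."""
    return [[1 if d <= cover_dist else 0 for d in row] for row in dist]

def set_cover_cost(a, f):
    """Solve (1)-(3) by enumeration; returns (cost, sites) or None if infeasible."""
    n_nodes, n_sites = len(a), len(f)
    best = None
    for k in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), k):
            # Constraint (2): every demand node is covered by at least one chosen site.
            if all(any(a[i][j] for j in sites) for i in range(n_nodes)):
                cost = sum(f[j] for j in sites)  # objective (1)
                if best is None or cost < best[0]:
                    best = (cost, sites)
    return best

# Hypothetical network: four nodes on a line; dist[i][j] is the node-to-site distance.
dist = [[0, 4, 9, 15],
        [4, 0, 5, 11],
        [9, 5, 0, 6],
        [15, 11, 6, 0]]
f = [10, 10, 10, 10]  # equal fixed costs

a = coverage_matrix(dist, cover_dist=5)
print(set_cover_cost(a, f))  # -> (20, (1, 3))
```

With a coverage distance of 5, node 3 can only be covered by a facility at site 3, and site 1 is the only single site that covers nodes 0, 1 and 2, so two facilities are required.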

In location problems, we are often interested in minimizing the number of 
facilities that are located, and not the cost of locating them. Such a situation 
might arise when the fixed facility costs are approximately equal and the 
dominant costs are operating costs that depend on the number of located 
facilities. In that case, the objective function becomes: 

\[ \text{Minimize} \quad \sum_{j \in J} X_j \tag{4} \]

To distinguish between these two model variants, we will refer to the 
problem with (1) as the objective function as the set covering problem or 
model; when (4) is used, we will call the problem the location set covering 
problem. A number of row and column reduction rules can be applied to the 
location set covering problem to reduce the size of the problem. Daskin [4] 
discussed and illustrated these rules. 

In practice, at least two major problems occur with the set covering model. 
First, if (1) is used as the objective function, the cost of covering all 
demands is often prohibitive. If (4) is used as the objective function, the 
number of facilities required to cover all demands is often too large. 
Second, the model fails to distinguish between demand nodes that generate a 
lot of demand per unit time and those that generate relatively little demand. 
Clearly, if we cannot cover all demands because the cost of doing so is 
prohibitive, we would prefer to cover those demand nodes that generate a lot 
of demand rather than those that generate relatively little demand. These 
two concerns motivated Church and ReVelle [12] to formulate the maximal 
covering problem. This model requires the following two additional inputs: 

h_i = demand at node i 
P = number of facilities to locate 

as well as the following additional decision variable: 

\[ Z_i = \begin{cases} 1 & \text{if demand node } i \text{ is covered} \\ 0 & \text{if not} \end{cases} \]



With this additional notation, the maximal covering location problem can be 
formulated as follows: 



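\[ \text{Maximize} \quad \sum_{i \in I} h_i Z_i \tag{5} \]

\[ \text{Subject to} \quad Z_i - \sum_{j \in J} a_{ij} X_j \le 0 \quad \forall i \in I \tag{6} \]

\[ \sum_{j \in J} X_j = P \tag{7} \]

\[ X_j \in \{0, 1\} \quad \forall j \in J \tag{8} \]

\[ Z_i \in \{0, 1\} \quad \forall i \in I \tag{9} \]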



The objective function (5) maximizes the number of covered demands. 
Again, it is important to note that this model maximizes demands that are 
covered and not simply nodes. Constraint (6) states that demand node i 
cannot be counted as covered unless we locate at least one facility that is 
able to cover the demand node. Constraint (7) states that exactly P facilities 
are to be located and constraints (8) and (9) are standard integrality 
constraints. 

A variety of heuristic and exact algorithms have been proposed for this 
model. In our experience, Lagrangian relaxation [13, 14] provides the most 
effective means of solving the problem. When constraint (6) is relaxed, the 
problem decomposes into two separate problems: one for the coverage 
variables and one for the location variables. The subproblem for the 
coverage variables can be solved by inspection and the location variable 
subproblem requires only sorting. This approach can typically solve 
instances of the problem with hundreds of demand nodes and candidate sites 
to optimality in a few seconds or minutes on today’s computers even though 
the problem is technically NP-hard [15, 16]. Schilling, Jayaraman and 
Barkhi [17] reviewed the general class of location covering models. 
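The decomposition described above can be sketched as follows. This is a minimal illustration, assuming nonnegative multipliers `lam[i]` (a hypothetical name) attached to the relaxed coverage constraints. It evaluates the Lagrangian upper bound for a given multiplier vector only. A full algorithm would also update the multipliers, for example by subgradient optimization, and that step is omitted here. The instance data are made up, and the brute-force routine is included only to check the bound on a tiny example.

```python
from itertools import combinations

def lagrangian_bound(h, a, P, lam):
    """Upper bound on the maximal covering objective for multipliers lam[i] >= 0."""
    n_nodes, n_sites = len(h), len(a[0])
    # Coverage subproblem, solved by inspection: Z_i = 1 only when h_i - lam_i > 0.
    z_value = sum(max(0.0, h[i] - lam[i]) for i in range(n_nodes))
    # Location subproblem, solved by sorting: open the P sites with the largest
    # coefficients sum_i lam_i * a_ij.
    coeffs = [sum(lam[i] * a[i][j] for i in range(n_nodes)) for j in range(n_sites)]
    x_value = sum(sorted(coeffs, reverse=True)[:P])
    return z_value + x_value

def max_cover_brute_force(h, a, P):
    """Exact optimum by enumerating every set of P sites (tiny instances only)."""
    n_nodes, n_sites = len(h), len(a[0])
    return max(
        sum(h[i] for i in range(n_nodes) if any(a[i][j] for j in sites))
        for sites in combinations(range(n_sites), P)
    )

# Hypothetical instance: demands h_i and coverage indicators a[i][j].
h = [30, 10, 20, 40]
a = [[1, 1, 0, 0],
     [1, 1, 1, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 1]]

for P in (1, 2):
    exact = max_cover_brute_force(h, a, P)
    bound = lagrangian_bound(h, a, P, lam=[float(x) for x in h])
    print(P, exact, bound)
```

For any nonnegative multipliers the bound is at least the optimal covered demand, so it can be used to judge how far a heuristic solution is from optimal.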

The P-center model addresses the problem of needing too many facilities to 
cover all demands by relaxing the service standard (i.e., by increasing the 
coverage distance). This model finds the location of P facilities to minimize 
the coverage distance subject to a requirement that all demands are covered. 
Daskin [4] provided a traditional formulation of this problem. More 
recently, Elloumi, Labbé and Pochet [18] presented an innovative 
formulation of the problem that exhibits improved computational 
characteristics when compared to the traditional formulation. 

The three models outlined so far - the location set covering model, the 
maximal covering location model, and the P-center model - treat service as 
binary: a demand node is either covered or not covered. While the notion of 
coverage is well established in health care applications, in many cases we 
are interested in the average distance that a client has to travel to receive 
service or the average distance that a provider must travel to reach his/her 
patients. To address such problems we turn to the P-median problem [19, 
20], which minimizes the demand weighted total (or average) distance. To 
formulate this problem, we need the following additional input 

d_{ij} = distance from demand node i to candidate location j 

as well as the following new decision variable 

\[ Y_{ij} = \begin{cases} 1 & \text{if demands at node } i \text{ are assigned to a facility at candidate site } j \\ 0 & \text{if not} \end{cases} \]

With this notation, the P-median problem can be formulated as follows: 



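\[ \text{Minimize} \quad \sum_{j \in J} \sum_{i \in I} h_i d_{ij} Y_{ij} \tag{10} \]

\[ \text{Subject to} \quad \sum_{j \in J} Y_{ij} = 1 \quad \forall i \in I \tag{11} \]

\[ Y_{ij} - X_j \le 0 \quad \forall i \in I; \ \forall j \in J \tag{12} \]

\[ \sum_{j \in J} X_j = P \tag{13} \]

\[ X_j \in \{0, 1\} \quad \forall j \in J \tag{14} \]

\[ Y_{ij} \in \{0, 1\} \quad \forall i \in I; \ \forall j \in J \tag{15} \]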



The objective function (10) minimizes the demand weighted total distance. 
This is equivalent to minimizing the demand weighted average distance 
since the total demand is a constant. Constraint (11) states that each demand 
node must be assigned to exactly one facility site. Constraint (12) stipulates 
that demand nodes can only be assigned to open facility sites. Constraint 
(13) is identical to (7) above and states that we are to locate exactly P 
facilities. Constraints (14) and (15) are standard integrality constraints. 
Constraint (15) can be relaxed to a simple non-negativity constraint since 
each demand node will naturally be assigned to the closest open facility. 
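The remark about the closest open facility suggests a simple way to evaluate any candidate solution: once the P sites are chosen, each demand node contributes its demand times the distance to the nearest chosen site. The sketch below uses that observation to solve a tiny instance by enumerating all sets of P sites. It is a brute-force illustration, not one of the algorithms cited in this chapter, and the demands and distances are hypothetical.

```python
from itertools import combinations

def p_median(h, d, P):
    """Enumerate all P-subsets of sites; returns (demand-weighted distance, sites)."""
    n_nodes, n_sites = len(h), len(d[0])
    best = None
    for sites in combinations(range(n_sites), P):
        # Each node is assigned to its nearest open site, which satisfies
        # constraints (11) and (12) without explicit Y variables.
        total = sum(h[i] * min(d[i][j] for j in sites) for i in range(n_nodes))
        if best is None or total < best[0]:
            best = (total, sites)
    return best

# Hypothetical instance: four nodes on a line, each also a candidate site.
h = [30, 10, 20, 40]
d = [[0, 4, 9, 15],
     [4, 0, 5, 11],
     [9, 5, 0, 6],
     [15, 11, 6, 0]]

print(p_median(h, d, P=2))  # -> (160, (0, 3))
```

Here the two facilities go to the two end nodes, which carry the largest demands, and the two interior nodes travel to whichever end is closer.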

As in the case of the maximal covering problem, a variety of heuristic 
algorithms have been proposed for the P-median problem. The two 
best-known algorithms are the neighborhood search algorithm [21] and the 
exchange algorithm [22]. More recently, genetic algorithms [23], tabu 
search [24, 25] and a variable neighborhood search algorithm [26] have been 
proposed for this problem. Correa et al. [27] developed a genetic algorithm 
(GA) for a capacitated P-median problem in which each facility can serve a 
limited number of demands. They compared their algorithm to a tabu search 
algorithm and found that the GA slightly outperformed the 
tabu search approach when the GA was accompanied by a heuristic 
hypermutation procedure. The latter simply performs an exchange algorithm 
on selected elements of the initial GA population and on population 
elements at a small number of randomly selected generations. 
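An exchange-style heuristic can be sketched as below. This is one common best-improvement variant written for illustration; it is not claimed to reproduce the exact procedure of [22]. Starting from any set of P sites, it repeatedly swaps one open site for one closed site whenever the best such swap lowers the demand-weighted distance, and stops at a local optimum. The instance data and the starting solution are hypothetical.

```python
def total_cost(h, d, sites):
    """Demand-weighted distance when each node uses its nearest open site."""
    return sum(h[i] * min(d[i][j] for j in sites) for i in range(len(h)))

def exchange_heuristic(h, d, start):
    """Best-improvement swap search; returns (cost, sorted list of open sites)."""
    current = set(start)
    current_cost = total_cost(h, d, current)
    n_sites = len(d[0])
    improved = True
    while improved:
        improved = False
        best_swap, best_cost = None, current_cost
        for out in current:              # site considered for closing
            for new in range(n_sites):   # site considered for opening
                if new in current:
                    continue
                candidate = (current - {out}) | {new}
                cost = total_cost(h, d, candidate)
                if cost < best_cost:
                    best_swap, best_cost = candidate, cost
        if best_swap is not None:
            current, current_cost = best_swap, best_cost
            improved = True
    return current_cost, sorted(current)

# Hypothetical instance: four nodes on a line, each also a candidate site.
h = [30, 10, 20, 40]
d = [[0, 4, 9, 15],
     [4, 0, 5, 11],
     [9, 5, 0, 6],
     [15, 11, 6, 0]]

print(exchange_heuristic(h, d, start=[0, 1]))  # -> (160, [0, 3])
```

The search ends at a solution that no single swap can improve. On larger instances such a local optimum need not be globally optimal, which is one motivation for the metaheuristics mentioned above.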

For moderate-sized problems, Lagrangian relaxation works quite well for the 
uncapacitated P-median problem. Constraint (11) is relaxed resulting in a 
set of subproblems for each candidate node that can easily be solved by 
inspection. Daskin [4] outlined the use of Lagrangian relaxation for both the 
P-median problem and the maximal covering model in detail. Daskin [28] 
reported solution times for a Lagrangian relaxation algorithm for the 
P-median and vertex P-center problems with up to 900 nodes. 

Some authors have transformed the maximal covering problem into a 
P-median formulation. This can be done by replacing the distance between 
demand node i and candidate site j by the following modified distance: 

\[ d'_{ij} = \begin{cases} 1 & \text{if } d_{ij} > D_c \\ 0 & \text{if not} \end{cases} \]

where D_c denotes the coverage distance. This has the effect of minimizing 
the total uncovered demand, which is equivalent to maximizing the covered 
demand. 
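The equivalence can be checked numerically on a small example. The sketch below assumes the modified distance equals 1 when the original distance exceeds the coverage distance and 0 otherwise, and it solves both problems by enumeration. The instance data, the coverage distance and the function names are hypothetical.

```python
from itertools import combinations

def max_covered(h, d, P, D_c):
    """Largest demand coverable by P sites within coverage distance D_c."""
    return max(
        sum(h[i] for i in range(len(h)) if any(d[i][j] <= D_c for j in sites))
        for sites in combinations(range(len(d[0])), P)
    )

def p_median_value(h, d, P):
    """Smallest demand-weighted distance over all sets of P sites."""
    return min(
        sum(h[i] * min(d[i][j] for j in sites) for i in range(len(h)))
        for sites in combinations(range(len(d[0])), P)
    )

def modified_distances(d, D_c):
    """Replace each distance by 1 if it exceeds D_c and by 0 otherwise."""
    return [[1 if dij > D_c else 0 for dij in row] for row in d]

# Hypothetical instance: four nodes on a line, each also a candidate site.
h = [30, 10, 20, 40]
d = [[0, 4, 9, 15],
     [4, 0, 5, 11],
     [9, 5, 0, 6],
     [15, 11, 6, 0]]
D_c = 5

for P in (1, 2, 3):
    uncovered = p_median_value(h, modified_distances(d, D_c), P)
    print(P, max_covered(h, d, P, D_c), sum(h) - uncovered)
```

For each value of P the two printed numbers agree: the P-median objective on the modified distances is the uncovered demand, so subtracting it from total demand recovers the maximal covered demand.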



The uncapacitated fixed charge location (UFL) problem is a close cousin of 
the P-median problem. The UFL problem is derived from the P-median 
problem by eliminating constraint (13) and adding the objective function (1) 
to objective function (10) multiplied by a suitable constant to convert 
demand-miles into cost units. The problem then becomes that of 
determining the optimal number of facilities as well as their locations and 
the allocation of demands to those facilities to minimize the combined fixed 
facility location costs and the transport costs. 
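
Written out, the combined objective described above takes the following form. The symbol \alpha is an assumed name for the constant that converts demand-miles into cost units; the source does not name it.

```latex
\[ \text{Minimize} \quad \sum_{j \in J} f_j X_j
   + \alpha \sum_{j \in J} \sum_{i \in I} h_i d_{ij} Y_{ij} \]
\[ \text{Subject to constraints (11), (12), (14) and (15)} \]
```

Because constraint (13) is dropped, the number of open facilities is no longer an input. It is determined by the tradeoff between the fixed-cost term, which favors few facilities, and the transport term, which favors many.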

3.3 LOCATION MODELS IN HEALTH CARE 

Having formulated three basic location models (the set covering model, the 
maximal covering model and the P-median model) and having qualitatively 
discussed two other classical models (the P-center problem and the 
uncapacitated fixed charge model) we now turn to applications and 
extensions of these models in health care. The health care location literature 
has tended to address three major topics, which we refer to as accessibility, 
adaptability and availability. 

By accessibility we mean the ability of patients or clients to reach the health
care facility or, in the case of emergency services, the ability of the health
care providers to reach patients. Papers that deal with accessibility tend to
ignore the needs of the system to evolve in response to changing conditions
as well as short-term fluctuations in the availability of service providers as a
result of their being busy serving other patients. Papers that focus on
accessibility tend to be direct applications of one or more of the models above
or are minor extensions of these models.

It is relatively easy and straightforward to site facilities based on a snapshot 
of the current or recent past conditions. Unfortunately, there is no guarantee 
that the future will replicate the past. Predicting future demand rates and 
operating conditions is exceptionally difficult. Thus, some recent 
applications and modeling efforts have focused on identifying solutions that 
can be implemented in the short term but that can adapt to changing future 
conditions relatively easily. 

For some health care systems, and for emergency services in particular, 
some portion of the nominal capacity is likely to be unusable by new 
demands at any point in time as it is already in use by current demands. 
Thus, an ambulance may be busy responding to one emergency when 
another call for service within its district arises. To handle such situations, a 
significant literature has focused on designing systems to maximize some 
measure of the availability of the servers. 

In short, accessibility models tend to take a snapshot of the system and plan 
for those conditions. As such, they are static models. Adaptability models 
often consider multiple future conditions and try to find good compromise 
solutions. As such, they tend to take a long-term view of the world. 
Availability models focus on the short-term balance between the ever- 
changing demand for services and the supply of those services. 

3.3.1 Accessibility models and applications 

Accessibility models attempt to find facility locations that perform well with 
respect to static inputs. In particular, demand, cost and travel distance or 
travel time data are generally assumed to be fixed and non-random in this 
class of models. Thus, the models are often relatively straightforward 
extensions of the classic models outlined in section 1 above. 

Indeed, federal legislation has encouraged the use of such models. The EMS 
(Emergency Medical Services) Act of 1973 stipulated that 95% of service 
requests had to be served within 30 minutes in a rural area and within 10 
minutes in an urban area. This encouraged the use of models like the
maximal covering model. Eaton et al. [29] used the maximal covering 
model to assist planners in Austin, TX in selecting permanent bases for their 
emergency medical service. The model was solved using the greedy adding 
and greedy adding and substitution algorithms. More recently, Adenso-Diaz 
and Rodriguez [30] also used the model to locate ambulances in Leon, 
Spain. They developed a tabu search algorithm to solve the problem. 
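
The greedy adding idea can be sketched as follows. This is a simplified illustration, not the procedure as implemented in [29]: the function name, the representation of coverage as one set of demand nodes per site, and the sample data are assumptions, and the substitution step is left out.

```python
def greedy_adding(h, cover, P):
    """Greedy adding heuristic for the maximal covering problem.

    h[i]     : demand at node i
    cover[j] : set of demand nodes within the coverage distance of site j
    P        : number of facilities to locate (must not exceed len(cover))
    """
    chosen, covered = [], set()
    for _ in range(P):
        # Add the unused site that covers the most not-yet-covered demand.
        best = max((j for j in range(len(cover)) if j not in chosen),
                   key=lambda j: sum(h[i] for i in cover[j] - covered))
        chosen.append(best)
        covered |= cover[best]
    return chosen, sum(h[i] for i in covered)


h = [5, 8, 3, 6]
cover = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]
sites, covered_demand = greedy_adding(h, cover, P=2)
```

Because each step is myopic, the result is not guaranteed to be optimal; a substitution phase or a metaheuristic such as tabu search can be applied to the greedy solution to look for improvements.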

Sinuany-Stern et al. [31] and Mehrez et al. [32] used two discrete models:
the P-median model and a variant of the fixed charge location model in
which they constrained the travel time to any hospital and also invoked
penalties for the assignment of demand to a hospital in excess of the
hospital’s capacity. These models were used, along with qualitative
techniques, to generate alternative locations, which were then analyzed using 
the analytic hierarchy method. It is worth noting that the sites that were 
ultimately preferred tended to be those that were identified using one or 
more of the analytic methods, as opposed to those identified using 
qualitative techniques. 

Jacobs, Silan and Clemson [33] used a capacitated P-median model to 
optimize collection, testing and distribution of blood products in Virginia 
and North Carolina. McAleer and Naqvi [34] also used a P-median model,
in this case to relocate ambulances in Belfast, Ireland. Their problem was to 
locate four facilities to serve 54 demand nodes. The authors used a heuristic 
approach that decomposed the demand nodes into four sectors and ranked 
the possible single facility locations within each sector. This led to a number 
of acceptable solutions in each sector. All combinations of acceptable 
locations were then evaluated using all 54 demand nodes. While such a 
heuristic decomposition approach may make intuitive sense, it is not 
guaranteed to result in an optimal solution. Modern algorithms (e.g., 
Lagrangian relaxation embedded in branch and bound as implemented in 
SITATION [35]) can readily solve such problems to optimality on today’s 
computers in seconds. Practitioners can also use such models to identify 
near optimal solutions, particularly when the number of facilities being 
located is small. 

In hierarchical location modeling, a number of different services are 
simultaneously located. These might be, for example, local clinics, 
community health centers and regional hospitals. Lower level facilities (e.g., 
clinics) are generally assigned lower numbers (e.g., 1), while the highest 
level facilities (e.g., regional hospitals) are assigned the top number (e.g., 3). 
Another common application of hierarchical modeling is the location of 
basic life support vehicles (BLS or level 1 facilities) and advanced life
support vehicles (ALS or level 2 facilities). 

At least three factors need to be considered in hierarchical location problems 
[36]. The first is whether a level m facility can provide only level m service 
or whether or not it can also provide services at all lower levels (1, ..., m). 
Clearly, an ALS vehicle can provide all levels of service that a BLS vehicle 
can provide. It is less clear that a regional hospital will be designed or 
staffed to provide all levels of support provided by a local clinic. For 
example, regional hospitals may not stock flu vaccines and, as such, may not 
be able to vaccinate individuals against the flu, while local clinics may be 
able to do so. A successively inclusive hierarchy is one in which a level m 
service can provide level m and all lower level services, while a successively 
exclusive hierarchy is one in which each level of service is provided by a 
unique facility. The second issue is, in a successively inclusive service, 
whether a level m facility can provide all m levels of service to all demand 
nodes, or a level m facility can provide all m levels of service only to 
demands at the node at which the facility is located and level m service only 
to other nodes. The former is referred to as a successively inclusive service 
hierarchy while the latter is termed a locally inclusive service hierarchy. A 
successively exclusive service hierarchy is one in which a level m facility 
provides only level m service to all nodes. Finally, there will generally be 
fewer high level facilities (e.g., regional hospitals) than low level facilities 
(e.g., local clinics). If high-level facilities can only be located at sites 
housing a lower level facility, the system is termed nested; otherwise it is not 
nested. 

Finally, Price and Turcotte [37] used a center of gravity model to locate a 
blood donor clinic in Quebec. The model was used with a variety of inputs 
to identify a number of different locations from which a final choice was 
made. The center of gravity model minimizes the demand-weighted average 
distance between a facility that can be located anywhere in the plane and a 
discrete set of points. It is in the class of continuous location models (since 
the single facility location can be anywhere in the plane), which we are not 
explicitly reviewing and that have seen relatively little use in the health care 
location field. Nevertheless, Sinuany-Stern et al. [31] and Mehrez et al. [32]
used two different continuous models in identifying candidate sites for a new
hospital in the Negev. The first was the Weber model, which minimizes the
demand weighted average Euclidean distance between a facility that can be
anywhere in the plane and fixed demand locations, while the second was
similar but used the square of the Weber objective function. (The reader
interested in the Weber problem should consult the excellent review by
Drezner et al. [38].)
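
A single-facility Weber point is commonly computed with the Weiszfeld iteration. The sketch below is a minimal illustration and is not drawn from [31], [32] or [38]; the function names, the fixed iteration count, and the shortcut of stopping when an iterate lands on a demand point are assumptions made for brevity.

```python
import math


def weighted_centroid(points, weights):
    """Demand-weighted mean of the coordinates (used here as a starting point)."""
    total = sum(weights)
    x = sum(w * p[0] for p, w in zip(points, weights)) / total
    y = sum(w * p[1] for p, w in zip(points, weights)) / total
    return (x, y)


def weber_point(points, weights, iterations=200, eps=1e-12):
    """Weiszfeld iteration for the demand-weighted Euclidean Weber problem."""
    x, y = weighted_centroid(points, weights)
    for _ in range(iterations):
        num_x = num_y = den = 0.0
        for (px, py), w in zip(points, weights):
            dist = math.hypot(x - px, y - py)
            if dist < eps:
                # Simplification: stop if the iterate coincides with a demand point.
                return (px, py)
            num_x += w * px / dist
            num_y += w * py / dist
            den += w / dist
        x, y = num_x / den, num_y / den
    return (x, y)


location = weber_point([(0, 0), (4, 0), (0, 4), (4, 4)], [1, 1, 1, 1])
```

For four equally weighted demand points at the corners of a square, the iteration stays at the center of the square, as symmetry suggests.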

3.3.2 Adaptability models

Location decisions must be robust with respect to uncertain future 
conditions, particularly for facilities such as hospitals that are difficult if not 
impossible to relocate as conditions change. A number of approaches have 
been developed to deal with future uncertainty. Scenario planning [39-41] is 
frequently used to handle future uncertainty. A number of future conditions 
are defined and plans are developed that do well in all (or most) scenarios. 

In scenario planning, some decisions are made before the true scenario is 
revealed while others can be made after knowledge of the true scenario is 
gained. In location planning, the facility sites must generally be chosen 
before we know which scenario will evolve; the assignment of demand 
nodes to sites can generally be done after we know which scenario will 
occur. 

Designing a robust system often entails compromises. The “best” 
compromise plan may not be optimal under any particular scenario but will 
do well across all scenarios. The regret associated with a compromise 
solution and a scenario measures the difference between the performance 
measure using the compromise solution for that scenario and the 
performance measure when the optimal solution is used for that scenario. 

Three performance measures are often used in scenario planning: 
optimizing the expected performance, minimizing the worst case 
performance, and minimizing the worst case regret. Minimizing the 
expected regret is identical to optimizing the expected performance. 
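
A small numerical illustration may help to separate the three criteria. The data and function name below are hypothetical, and the sketch assumes that the list of candidate plans includes the optimal plan for each scenario, so that the best value in each column is the scenario optimum.

```python
def compare_plans(cost, prob):
    """cost[p][s]: demand-weighted distance of plan p under scenario s."""
    plans = range(len(cost))
    scenarios = range(len(prob))
    # Best value attainable in each scenario (assumed to be among the listed plans).
    best = [min(cost[p][s] for p in plans) for s in scenarios]
    regret = [[cost[p][s] - best[s] for s in scenarios] for p in plans]
    expected_cost = [sum(prob[s] * cost[p][s] for s in scenarios) for p in plans]
    expected_regret = [sum(prob[s] * regret[p][s] for s in scenarios) for p in plans]
    return {
        "min_expected_cost": min(plans, key=lambda p: expected_cost[p]),
        "min_expected_regret": min(plans, key=lambda p: expected_regret[p]),
        "minimax_cost": min(plans, key=lambda p: max(cost[p])),
        "minimax_regret": min(plans, key=lambda p: max(regret[p])),
    }


cost = [[100, 220], [112, 160], [150, 140]]  # three plans, two scenarios
prob = [0.9, 0.1]
choice = compare_plans(cost, prob)
```

In this instance each criterion other than expected regret selects a different plan. Expected cost and expected regret differ only by the constant 0.9(100) + 0.1(140) = 104, which is why they always pick the same plan.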

In what follows, we formulate scenario-based extensions to the P-median 
problem. We define the following additional set and input:

S = set of scenarios

$q_s$ = probability that scenario s will occur

With this additional notation, the problem of minimizing the expected 
demand weighted total distance is formulated as follows, where we have 
added the subscript s to the demands and distances as well as the allocation 
variables: 
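
The formulation can be written out as follows. This is a sketch reconstructed from the constraint descriptions in the text and from the notation of constraint (23); it assumes that $X_j$ is the binary location variable and P the number of facilities from the P-median model, that $Y_{ijs}$ is the assignment of demand node i to site j in scenario s, and that $h_{is}$ and $d_{ijs}$ are the scenario-specific demands and distances.

$$\text{Min } \sum_{s \in S} \sum_{j \in J} \sum_{i \in I} q_s h_{is} d_{ijs} Y_{ijs} \tag{16}$$

$$\text{Subject to } \sum_{j \in J} Y_{ijs} = 1 \quad \forall i \in I;\ s \in S \tag{17}$$

$$Y_{ijs} - X_j \le 0 \quad \forall i \in I;\ j \in J;\ s \in S \tag{18}$$

$$\sum_{j \in J} X_j = P \tag{19}$$

$$X_j \in \{0, 1\} \quad \forall j \in J \tag{20}$$

$$Y_{ijs} \in \{0, 1\} \quad \forall i \in I;\ j \in J;\ s \in S \tag{21}$$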

The objective function (16) minimizes the expected demand weighted total
distance over all scenarios. Constraint (17) states that each demand node is 
assigned to a facility in each scenario. Constraint (18) stipulates that these 
assignments can only be made to open facilities. Constraints (19) and (20) 
are identical to (13) and (14), respectively, and (21) is a standard integrality 
constraint. 

To minimize the worst-case performance, the problem is restructured as 
follows: 

$$\text{Min } W \tag{22}$$

$$\text{Subject to } W - \sum_{j \in J} \sum_{i \in I} h_{is} d_{ijs} Y_{ijs} \ge 0 \quad \forall s \in S \tag{23}$$

and (17)-(21)

where W is the maximum demand weighted total distance across all
scenarios.

Finally, to minimize the maximum regret, we solve the following problem: 

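$$\text{Min } V \tag{24}$$

$$\text{Subject to } V - \left( \sum_{j \in J} \sum_{i \in I} h_{is} d_{ijs} Y_{ijs} - V_s^{*} \right) \ge 0 \quad \forall s \in S \tag{25}$$

and (17)-(21)

where $V_s^{*}$ is the optimal objective function value (smallest demand
weighted total distance) for scenario s.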

Both the minimax model (22)-(23) and the minimax regret model (24)-(25) 
avoid the need for scenario probabilities, which can be difficult to estimate. 
However, these models suffer from the fact that an unlikely scenario can 
drive the entire solution. At the other extreme, the problem of minimizing 
the expected performance (or equivalently the expected regret) tends to 
undervalue scenarios in which the compromise solution performs poorly if 
those scenarios are low probability events. To handle these problems,
Daskin, Hesse and ReVelle [42] introduced an α-reliable minimax regret
model. The model minimizes the maximum regret over an endogenously
determined subset of the scenarios whose total probability must be at least α.

Carson and Batta [43] considered the problem of locating an ambulance on 
the campus of the State University of New York at Buffalo in response to 
changing daily conditions. This is a particular problem on a large university 
campus since the center of gravity of the population shifts from dormitories 
to classrooms and offices over the course of the day. They determined that 
modeling four different time periods would suffice. By relocating the 
ambulance for each period, they were able to reduce the predicted average 
response time by 30% from 3.38 minutes (with a single static location) to 
2.28 minutes (with four periods of unequal duration). The actual decrease in 
travel time when the solution was implemented was closer to 6% with the 
difference attributed to the non-linear nature of travel times. This work 
should not technically be viewed as part of the scenario planning literature 
since the decisions for each time period are unlinked. However, the work 
does highlight the value of being able to modify ambulance locations in
response to changing daily conditions. The work also emphasized the need 
for careful modeling of travel time relationships, particularly when the 
average time is likely to be small. 

ReVelle, Schweitzer and Snyder [44] proposed a number of variants of a 
conditional covering model in which demands at a node that houses a facility 
must be covered by a facility located elsewhere. In such models, the original 
demand nodes must be covered and each facility located by the model must 
be covered by a different facility. The rationale for such models is that if an 
emergency occurs at node j (e.g., an earthquake), then any emergency 
services at that location must be assumed to be damaged or unavailable for 
service at that node. Therefore, the node must be covered by some other 
facility. 

In many important cases, the actual number of facilities that can be 
constructed in the long term is uncertain when the planning begins. Then, it 
is often important to be able to locate a known number of facilities now, 
accounting for the possibility that additional facilities could be built in future 
years. Current, Ratick and ReVelle [45] addressed this uncertainty with two 
models. In each model, the first stage decision entails locating $P_0$ facilities
now and $P_s$ facilities in future state s, which occurs with probability $\pi_s$. The
objective of the first problem is to minimize the expected opportunity loss 
(or regret) while the second problem minimizes the maximum regret. They 
illustrated the results using a small problem with 20 nodes, of which 10 were 
candidate facilities, and 4 future states allowing for 0, 1, 2, or 3 additional 
facilities to be constructed. The models were solved using a standard LP/IP 
solver on a personal computer. 

3.3.3 Models of facility availability 

Adaptability reflects long-term uncertainty about the conditions under which 
a system will operate. Availability, in contrast, addresses very short-term 
changes in the condition of the system that result from facilities being busy. 
Such models are most applicable to emergency service systems 
(ambulances) in which a vehicle may be busy serving one demand at the 
time it is needed to respond to another emergency. 

Deterministic models. One simple, but somewhat crude, way of dealing with
vehicle busy periods is to find solutions that cover demand nodes multiple 
times. The Hierarchical Objective Set Covering (HOSC) model [46] first 
minimizes the number of facilities needed to cover all demand nodes. Then, 
from among all the alternate optima to this problem - and there often are 
multiple alternate optima - the model selects the solution that maximizes the 
system-wide multiple coverage. The multiple coverage of a node is given by
the total number of times the node is covered in addition to the one time 
needed to satisfy the set covering requirement. The system-wide multiple 
coverage is the sum of the nodal multiple coverage over all nodes. In 
essence, the model introduces an explicit surplus variable into constraint (2) 
and maximizes the sum of the surplus variables as a secondary objective to 
objective (4). 

Benedict [47] modified the HOSC model to account for node demands and 
termed this excess coverage. To do so, he weighted the surplus variable by 
the node’s demand. Eaton et al. [48] independently formulated and solved a 
similar model for locating ambulances in Santo Domingo. Hogan and 
ReVelle [49] considered a similar model that they termed backup coverage 
in which only a single additional cover of each node was counted and the 
additional cover of the node was weighted by the demand at the node. 
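
The three secondary objectives differ only in how the surplus coverage of each node is counted. The sketch below evaluates them for a given set of locations; it is illustrative only, with hypothetical function names and data, and it assumes a 0/1 coverage matrix `a` and a vector `x` giving the number of facilities at each site.

```python
def coverage_surplus(a, x):
    """Number of times each node is covered beyond the single required cover."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) - 1 for row in a]


def coverage_measures(a, x, h):
    surplus = coverage_surplus(a, x)
    if min(surplus) < 0:
        raise ValueError("some demand node is not covered at all")
    return {
        # HOSC: unweighted sum of the surplus over all nodes.
        "multiple": sum(surplus),
        # Excess coverage: surplus weighted by nodal demand.
        "excess": sum(h_i * s for h_i, s in zip(h, surplus)),
        # Backup coverage: only the first additional cover counts, weighted by demand.
        "backup": sum(h_i * min(s, 1) for h_i, s in zip(h, surplus)),
    }


a = [[1, 1, 0], [0, 1, 1], [1, 1, 1]]
measures = coverage_measures(a, x=[1, 1, 1], h=[10, 20, 30])
```

With one facility at each of the three sites, the nodes are covered 2, 2 and 3 times, so the surpluses are 1, 1 and 2.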

Benedict also modified the maximal covering model to account for excess 
coverage. In this model the primary objective is to maximize the covered 
demand while the secondary objective is to maximize the excess coverage in 
the system. Benedict’s third model was termed the hierarchical objective 
excess coverage model. In this model, the primary objective is to maximize 
excess coverage within T time units using the minimum number of facilities
needed to cover all demand within T; the secondary objective is to maximize the
demand that is covered within S, where S is less than T. Daskin, Hogan and 
ReVelle [50] reviewed a variety of models of multiple, excess and backup 
coverage as well as the expected covering model discussed below. 

Gendreau, Laporte and Semet [51] considered the problem of maximizing 
the number of demands that are covered by (at least) two ambulances within a
distance $r_1 < r_2$ while ensuring that each demand is covered within $r_2$ and
that at least α% of the demand is covered within $r_1$. A total of P ambulances
are to be located. Like other multiple coverage models, this formulation is 
designed to increase the likelihood of there being an available ambulance 
within the coverage distance of a demand. Gendreau, Laporte and Semet 
solved the problem using tabu search for problem instances with up to 400 
demand nodes and 70 candidate sites and 45 facilities. 

Pirkul and Schilling [52] developed a model that minimizes the sum of the 
fixed facility costs, the costs of primary service and the costs of secondary 
service. Each demand node must be assigned to both a primary and a 
secondary facility. They developed a Lagrangian heuristic for solving the 
problem. The algorithm was embedded in a branch and bound algorithm to 
ensure optimality. They applied the algorithm to test problems ranging in 
size from 100 demand nodes and 10 candidate sites to 300 demand nodes
and 30 candidate locations. They also tested the algorithm on a fire station 
location problem with 30 candidate sites and 625 demand nodes. By varying 
the weight on the fixed cost term of the objective function, the tradeoff 
between the number of facilities located and the average (primary and 
secondary) distance was identified for this larger problem. 

Narasimhan, Pirkul and Schilling [53] considered the problem of locating a 
fixed number of facilities to maximize the amount of covered demand across 
a number of different levels of coverage, subject to a constraint that the total 
demand assigned to a facility across all levels of coverage cannot exceed a 
given value (the facility capacity). The model converts the maximal 
covering model into a P-median model and then introduces multiple levels 
of coverage and facility capacities. They argued that this “service level” can 
represent the order in which the facility providing service is called for 
service at a node. This is somewhat problematic since the order in which a 
facility at node j is called upon to respond to demands at node i depends on 
the location of other facilities, which is determined endogenously. 
Specifying this order exogenously seems extraordinarily difficult. They 
used a Lagrangian approach to solve the problem heuristically, relaxing the
assignment constraints. The authors solved the problem with up to 200
demand nodes, 30 candidate sites, 5 levels of service and 15 facilities being 
sited. Optimality gaps tended to be small, though for some (smaller) 
problems the maximum gap was 3 percent. 

Probabilistic models. The models discussed above take a deterministic
approach to increasing the likelihood that a demand will be covered by an 
available vehicle or served adequately. Two different probabilistic 
approaches have been developed. The first approach is based on queuing 
theory while the second is based on Bernoulli trials. 

Fitzsimmons [54] approximated the number of busy ambulances using an
M/G/∞ queuing model. The average service time in his model depends on
the number of busy vehicles, which, in turn, depends on the average service 
time. Thus, the two quantities are jointly estimated using an iterative 
sampling procedure. This is embedded in a search routine for finding 
improved ambulance locations. Eaton [55] provided an introduction to the 
use of this model in siting ambulances. While Fitzsimmons’ approach can 
readily be embedded in a heuristic facility location model, it does not fully 
account for spatial differences in the probability of a vehicle being busy. 
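
For a fixed mean service time, the steady-state number of busy servers in an M/G/∞ queue follows a Poisson distribution whose mean is the arrival rate multiplied by the mean service time. The sketch below computes that distribution. It is not Fitzsimmons' procedure, since it holds the service time constant rather than iterating on its dependence on the number of busy vehicles; the function name and sample values are assumptions.

```python
import math


def busy_distribution(arrival_rate, mean_service_time, max_busy):
    """P(n vehicles busy), n = 0..max_busy, for an M/G/infinity queue."""
    load = arrival_rate * mean_service_time  # mean number of busy vehicles
    return [math.exp(-load) * load ** n / math.factorial(n)
            for n in range(max_busy + 1)]


# Example: 2 calls per hour, 45 minutes (0.75 hours) per call on average.
dist = busy_distribution(arrival_rate=2.0, mean_service_time=0.75, max_busy=40)
mean_busy = sum(n * p for n, p in enumerate(dist))
```

In an iterative scheme of the kind described above, the mean service time would be re-estimated from the resulting distribution and the calculation repeated until the two quantities agree.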

To address this shortcoming, Larson [56] developed a hypercube queuing 
model that accounts for spatially distributed service systems. The hypercube 
model is essentially an M/M/N queuing model with distinguishable servers.
A binary string whose length is equal to the number of servers represents
each state of the queuing system. For a system with n servers (ambulances)
the model requires the solution of $2^n$ simultaneous linear equations. Larson
[57] proposed an approximation to the exact hypercube that entails solving n 
non-linear equations. Because of the difficulty in solving these models with 
known locations, they have tended not to be used in optimization modeling. 
Jarvis [58], however, embedded an approximation to the hypercube model in 
a heuristic search algorithm. Brandeau and Larson [59] used the hypercube 
model to locate ambulances in Boston. 

An alternate, though less exact, approach involves representing the
availability of a vehicle at any site j as the outcome of a Bernoulli trial
in which the vehicle is busy with probability q and available with
probability 1 - q. Then, assuming
that the probability q is the same throughout the system, the probability that
all k vehicles that can cover a demand node i are busy is $q^k$. The
probability that at least one of these k vehicles is available is $1 - q^k$ and the
incremental probability of at least one being available given that k vehicles
can cover the demand node rather than just k-1 vehicles is

$$(1 - q^k) - (1 - q^{k-1}) = q^{k-1} - q^k = q^{k-1}(1 - q).$$

This argument is at the heart of the maximum expected covering location 
model proposed by Daskin [60, 61]. To formulate this model, we define the 
following decision variable: 

$$Y_{ik} = \begin{cases} 1 & \text{if demands at node } i \text{ are covered by } k \text{ or more vehicles} \\ 0 & \text{if not} \end{cases}$$

With this notation, the maximum expected covering model can be 
formulated as follows: 

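$$\text{Max } \sum_{k=1}^{P} \sum_{i \in I} h_i q^{k-1}(1 - q) Y_{ik} = (1 - q) \sum_{k=1}^{P} \sum_{i \in I} h_i q^{k-1} Y_{ik} \tag{26}$$

$$\text{Subject to } \sum_{k=1}^{P} Y_{ik} - \sum_{j \in J} a_{ij} X_j \le 0 \quad \forall i \in I \tag{27}$$

$$\sum_{j \in J} X_j = P \tag{28}$$

$$X_j \in \{0, 1, \ldots, P\} \quad \forall j \in J \tag{29}$$

$$Y_{ik} \in \{0, 1\} \quad \forall i \in I;\ k = 1, \ldots, P \tag{30}$$
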

Under the independence assumption implicit in the Bernoulli trials model 
and the assumption that a single system-wide probability of a vehicle being 
busy (q) can be estimated, the objective function (26) maximizes the 
expected demand covered by an available vehicle. Constraint (27) links the 
location variables to the coverage variables and states that a demand node 
cannot be counted as being covered k times unless there are at least k vehicles
that can cover the node. Constraint (28) states that exactly P vehicles are to
be located. Constraint (29) states that an integer number of vehicles must be
located at any node, and constraint (30) states that the counting variables
($Y_{ik}$) are binary. Note that constraint (29) does not restrict the number of
vehicles at any location to be either 0 or 1. Daskin [61] proposed an 
exchange-based heuristic that approximates the solution for all values of q, 
the probability of a vehicle being busy. 
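
For a given set of vehicle locations, the value of objective (26) can be evaluated directly. The sketch below does so in two equivalent ways, once from the closed form $1 - q^k$ and once by summing the increments $q^{k-1}(1 - q)$. The function names and data are hypothetical, and the sketch assumes a 0/1 coverage matrix `a` and a vector `x` giving the number of vehicles at each site.

```python
def expected_covered_demand(h, a, x, q):
    """Expected demand covered by an available vehicle, via 1 - q**k."""
    total = 0.0
    for h_i, row in zip(h, a):
        k = sum(a_ij * x_j for a_ij, x_j in zip(row, x))  # vehicles able to cover node i
        total += h_i * (1.0 - q ** k)
    return total


def expected_covered_demand_incremental(h, a, x, q):
    """Same quantity, built up from the incremental terms used in (26)."""
    total = 0.0
    for h_i, row in zip(h, a):
        k = sum(a_ij * x_j for a_ij, x_j in zip(row, x))
        total += h_i * sum((1.0 - q) * q ** (m - 1) for m in range(1, k + 1))
    return total


h = [10, 20]
a = [[1, 0], [1, 1]]
x = [1, 2]  # one vehicle at the first site, two at the second
value = expected_covered_demand(h, a, x, q=0.5)
```

With q = 0.5, the first node is covered by one vehicle (expected coverage 10 × 0.5) and the second by three (20 × 0.875), giving 22.5 in total.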

Repede and Bernardo [62] extended the maximal expected covering
location model to incorporate different time periods. The model allowed
planners to reduce ambulance response time in Louisville, Kentucky, by
36%. They used simulation to validate the results of the time-variant
expected covering model and to get better approximations of the actual
expected coverage. 

The maximum expected covering location model has two major limitations. 
Batta, Dolan and Krishnamurthy [63] showed that the independence 
assumption does not generally hold. They proposed a number of ways of
handling this, including a formulation of an adjusted maximum expected
covering location model that uses a correction term similar to that used by
Larson [57] in developing the hypercube queuing model approximation.
The second limitation of the maximum expected covering model has to do
with the computation of the system-wide busy probability. Daskin [61]
suggested computing the system-wide busy probability as

$$q = \frac{\bar{t} \sum_{i \in I} h_i}{24\,P}$$

where $\bar{t}$ = average service time (in hours).

ReVelle and Hogan [64, 65] extended the computation of the system-wide
busy probability to account for local conditions by approximating

$$q_i = \frac{\bar{t} \sum_{k \in M_i} h_k}{24 \sum_{j \in N_i} X_j} \tag{31}$$

where

$M_i$ = set of demand nodes that are within the coverage distance of node i,

$N_i$ = set of candidate sites that can cover demand node i, and

$q_i$ = probability that a vehicle located at i will be busy.

With this local busy probability, ReVelle and Hogan [64] formulated the
probabilistic set covering model as follows:

$$\text{Minimize } \sum_{j \in J} X_j \tag{32}$$

$$\text{Subject to } \sum_{j \in J} a_{ij} X_j \ge b_i \quad \forall i \in I \tag{33}$$

$$X_j \in \{0, 1, 2, \ldots\} \quad \forall j \in J \tag{34}$$

In this model, node i must be covered $b_i$ times, where $b_i$ is the smallest
value satisfying

$$1 - \left( \frac{\bar{t} \sum_{k \in M_i} h_k}{24\, b_i} \right)^{b_i} \ge \alpha$$

and α is the required probability of a node being covered by an available
vehicle. Thus, this model is essentially a set covering model except that the
right hand side of (33) can be greater than 1 and we can locate more than one
vehicle at a node. The model finds the minimum number of vehicles
required to ensure that each demand node is covered by an available vehicle
with probability α, using the local busy probability estimates given by (31).
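
The required number of covers for a node can be found by a simple search over the number of vehicles. The sketch below is a hypothetical illustration rather than the authors' code; it assumes that the local demand (the sum of demands within the coverage distance of the node) is expressed in calls per day and the average service time in hours, so that the factor of 24 converts the workload into a fraction of the day.

```python
def local_busy_fraction(t_bar, local_demand, vehicles):
    """Busy fraction when `vehicles` units share the local workload, as in (31)."""
    return t_bar * local_demand / (24.0 * vehicles)


def required_covers(t_bar, local_demand, alpha, max_vehicles=100):
    """Smallest b with 1 - q**b >= alpha, where q depends on b."""
    for b in range(1, max_vehicles + 1):
        q = local_busy_fraction(t_bar, local_demand, b)
        if q < 1.0 and 1.0 - q ** b >= alpha:
            return b
    raise ValueError("no feasible number of vehicles up to max_vehicles")


# Example: 1-hour average service time, 12 calls per day in the neighborhood.
b_90 = required_covers(t_bar=1.0, local_demand=12.0, alpha=0.90)
b_99 = required_covers(t_bar=1.0, local_demand=12.0, alpha=0.99)
```

With these values one vehicle would be busy half the time (reliability 0.5), two vehicles give 1 - 0.25² = 0.9375, and three give roughly 0.995, so the required numbers of covers are 2 and 3 for reliabilities of 0.90 and 0.99.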

ReVelle and Hogan [64] defined the α-reliable P-center problem and the
maximum reliability location problem. The α-reliable P-center problem
finds the smallest coverage distance such that all demands are covered with
probability α by an available vehicle. This is solved by solving the problem
above (32)-(34) for successively smaller values of the coverage distance
until the objective function exceeds P. The maximum reliability location
problem is to find the locations of P facilities such that the reliability α is
maximized. This can be solved by fixing a feasible value of α and then
solving the problem above. The value of α is then increased until the
required number of vehicles increases above P.

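Similarly, ReVelle and Hogan [65] formulated the maximum availability
location problem (MALP) as the problem of locating P vehicles to maximize
the number of demands that are covered by an available vehicle with
probability at least α. Using the notation defined above, this model
becomes:

$$\text{Maximize } \sum_{i \in I} h_i Y_{i b_i} \tag{35}$$

$$\text{Subject to } \sum_{k=1}^{b_i} Y_{ik} - \sum_{j \in J} a_{ij} X_j \le 0 \quad \forall i \in I \tag{36}$$

$$Y_{ik} - Y_{i,k-1} \le 0 \quad \forall i \in I;\ k = 2, \ldots, b_i \tag{37}$$

$$\sum_{j \in J} X_j = P \tag{38}$$

$$X_j \in \{0, 1, 2, \ldots\} \quad \forall j \in J \tag{39}$$

$$Y_{ik} \in \{0, 1\} \quad \forall i \in I;\ k = 1, \ldots, b_i \tag{40}$$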

The objective function (35) maximizes the total demand that is covered by
an available vehicle with probability at least α. Constraint (36) links the
coverage and location variables. Constraint (37) states that a node cannot be
counted as being covered k times unless it is also counted as being covered
k-1 times. This constraint is not needed in the maximum expected covering
problem since the decreasing value of the objective function coefficients for
0 < q < 1 ensures that the coverage variables will enter the solution in this
order. Constraint (38) states that P vehicles are to be located. Constraints
(39) and (40) are integrality constraints. Again, we do not limit the number 
of vehicles located at a node to 0 or 1. 

Ball and Lin [66] developed a model that is similar to the maximum
availability location problem of ReVelle and Hogan [65], but did so from
first principles. This helps identify the assumptions necessary for the
development of the model. They then outlined a number of constraints that 
can be added to the formulation to tighten its linear programming relaxation, 
thereby facilitating the solution of the problem. 

Goldberg et al. [67] developed a highly non-linear model that accounts for 
vehicle busy periods as a function of assignments. Assignments are for the
kth vehicle to respond to a demand in a region. The model was solved
heuristically and was applied to the location of ambulances in Tucson, AZ. 
The model objectives include maximizing the number of calls responded to 
in 8 minutes (success rate), maximizing the worst node’s success rate, and 
balancing workload. The approach was used primarily to evaluate a given 
set of sites though they did do some limited experimentation with an 
exchange algorithm. 

Mandell [68] formulated a hierarchical ambulance location model in which
demands are not covered unless either (1) a basic life support (BLS) unit can
arrive at the scene within $t^B$ and an advanced life support (ALS) unit can
arrive within $t^A$ with $t^A > t^B$ or (2) an ALS unit can arrive within $t^B$. The
model was formulated in terms of the probability that a demand is served
adequately given that there are r ALS units within $t^A$, r' ALS units within $t^B$
and s BLS units within $t^B$. Mandell used a two-dimensional Markov model
(with states representing the number of ALS units within $t^A$ of a demand
node and the number of BLS units within $t^B$ of the node) to estimate the
required probabilities. The Markov model used demand-area specific arrival
rates. The model was tested on a 55-node network. Computation times 
were under 1.5 seconds in all cases for the IP problem as formulated. 

In the models described above, the primary objective was to account for
vehicle busy periods. Another source of randomness arises from the location
of the demands. Recognizing that demands occur over a region and not at
discrete points, Aly and White [69] considered a probabilistic extension of
the set covering model and of the P-median model. In both models the 
location of demands is uncertain, making the travel times random variables. 
Demand locations are uniformly distributed in rectangular regions. The 
distribution of travel time to a random point from a base with given 
coordinates is derived. From this the probability of being able to cover 
demands in a region from the base within a given time limit is derived. This 
results in the probabilistic set covering model - minimize the number of
facilities needed to ensure that each region is covered with probability y -
becoming a standard set covering model. Similarly, once we have the
distribution of travel times, we can compute the expected travel time from a
base at j with known coordinates to a point that is randomly distributed in
some rectangular region i. This makes the probabilistic P-median problem -
minimize the demand weighted expected travel time - a standard P-median 
problem as well. They concluded that the probabilistic formulation requires 
more facilities than does the deterministic formulation. Specifically, they 
stated, “In summary, using an aggregate point to represent a densely 
populated area may yield a less expensive siting cost. However, by ignoring 
the probabilistic element the actual service level will be much less than the 
one anticipated by the decision-maker.” (p. 1176)
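
As a rough numerical illustration of this point, the following Python sketch estimates the probability that a base can reach a uniformly distributed demand within a time limit, together with the expected travel time. It works by simulation rather than by the analytical derivation of Aly and White. Rectilinear travel at constant speed is an assumption made here for simplicity, and the function name and parameters are hypothetical.

```python
import random


def coverage_stats(base, rect, speed, time_limit, n_samples=100_000, seed=0):
    """Monte Carlo estimate of (P(travel time <= time_limit), E[travel time])
    for a demand uniformly distributed in rect = (x_min, y_min, x_max, y_max).

    Assumption (not from the source): rectilinear travel at constant speed.
    """
    rng = random.Random(seed)
    bx, by = base
    x_min, y_min, x_max, y_max = rect
    covered = 0
    total_time = 0.0
    for _ in range(n_samples):
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        t = (abs(x - bx) + abs(y - by)) / speed
        total_time += t
        covered += t <= time_limit
    return covered / n_samples, total_time / n_samples


# Base at a corner of a unit square, unit speed, time limit of 1.0.
p_cover, mean_time = coverage_stats((0.0, 0.0), (0.0, 0.0, 1.0, 1.0), 1.0, 1.0)
print(p_cover, mean_time)
```

In this example the centroid of the square is exactly 1.0 time units from the base, so a model that aggregates the region to its centroid would treat it as covered, whereas only about half of the demands in the region can actually be reached within the limit.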

Whether it arises from uncertain demand locations, vehicle busy periods, or 
changing and uncertain underlying conditions, stochasticity will degrade the 
performance of the system for a fixed set of resources. 

3.4 ANOTHER APPLICATION OF LOCATION MODELS IN 
HEALTH CARE 

The location set covering model - objective function (4) subject to (2) and 
(3) - has recently been used in a new health care application. Laporte et al. 
[70] reported on the use of this model to determine the minimum number of 
fields of view (FOV) to read a cytological sample (PAP test). A field of 
view is the area that a microscope can see without moving the slide being 
analyzed. All areas of interest on a slide need to be examined (i.e., need to 
be in at least one FOV). At the same time, one would like to minimize the 
number of required FOVs so as to minimize the time needed to analyze each 
sample. 

While the set covering model used by Laporte et al. is identical to that used 
in the location problems discussed above, there is an important difference. 
Typical location problems involve several hundred demand nodes and 
candidate locations. Solution time is not generally a problem in these
instances because the problems are small and they do not have to be solved 
in real time. In the cytological example, the number of points to be covered 
can range from 2,500 to 55,000, approximately two orders of magnitude 
more than is typically found in a facility location example. Furthermore, the 
problems have to be solved very quickly as decisions about how to read a 
sample need to be made in real time. In addition, once appropriate FOVs
have been identified, a routing problem needs to be solved to guide the 
microscope from one FOV to the next. 

Laporte et al. [70] employed a series of heuristics to attack the problem. 
First, a mesh of FOVs was generated to cover all of the points of interest. 
Within each square, the smallest rectangle containing all of the points in the 
square was identified and up to four additional FOVs were generated, one 
located at each of the corners of this rectangle. A number of heuristics were
then used to identify FOVs to include in the solution and others that could be 
excluded. Then a greedy heuristic proposed by Balas and Ho [71] was 
applied to solve the remaining problem. The routing heuristic was a 
straightforward adaptation of the strip heuristic proposed by Daganzo [72]. 
Solution times for the combined heuristic were typically under two minutes 
and thus were satisfactory for this application. 
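
To make the covering step concrete, here is a minimal sketch of a generic greedy heuristic for set covering in Python. It is not the specific Balas and Ho procedure, and it omits the mesh generation and preprocessing steps of Laporte et al.; the function name, the data layout, and the tiny example are assumptions for illustration only.

```python
def greedy_set_cover(points, candidates):
    """Generic greedy set covering heuristic.

    points:     iterable of point identifiers that must all be covered.
    candidates: dict mapping a candidate FOV id to the set of points it covers.
    Returns the list of chosen FOV ids, in the order they were selected.
    """
    uncovered = set(points)
    chosen = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered points.
        best = max(candidates, key=lambda c: len(candidates[c] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            raise ValueError("some points cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical example: six points of interest, four candidate fields of view.
fovs = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6}, "D": {1, 6}}
print(greedy_set_cover(range(1, 7), fovs))
```

A greedy rule of this kind gives no optimality guarantee, but each iteration is cheap, which matters when tens of thousands of points must be covered in real time.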

Brotcorne, Laporte and Semet [73] subsequently developed even faster
heuristics for the tiling problem. It is worth noting that the best results in 
terms of a compromise between solution quality and execution time were 
generally not those that involve using the heuristic solution to the set 
covering model; instead, they used a variety of improvement heuristics. 

3.5 SUMMARY AND DIRECTIONS FOR FUTURE WORK 

In this chapter we have presented the formulations of three location models 
that underlie most of the facility location models used in health care. The set 
covering model finds the minimum number (or cost) of facilities needed to 
cover all demands within a specified time or distance. The maximal 
covering location model relaxes the condition that all demands must be 
served within the covering standard and maximizes the number of covered 
demands using a fixed number of facilities. Finally, the P-median model 
drops the notion of coverage and minimizes the demand-weighted total 
distance between demand nodes and the nearest facilities. 

We identified three approaches to location modeling that have been used in 
health care applications. Accessibility models are typically straightforward 
extensions or applications of one of the basic location models. The goals of 
accessibility models are generally to maximize coverage or to minimize 
average distance. Adaptability models recognize that future conditions are
difficult, if not impossible, to predict. These models attempt to find 
solutions that perform well across a range of future scenarios. Generally, a 
single set of locations must be identified for all scenarios, but the assignment 
of demands to facilities can be scenario-dependent. Typical objectives 
include optimizing the expected system performance, minimizing the worst- 
case performance, and minimizing the maximum regret. Regret measures 
the difference in the performance of the system for a given scenario between 
the compromise solution and the solution that would have been optimal for 
the specified scenario. Availability models attempt to account for the short- 
term unavailability of vehicles or facilities. Many such models have been 
applied to ambulance location problems. An ambulance might not be 
available when called upon for service because it is already serving another 
demand. A variety of deterministic, queuing-based and probabilistic 
availability models were reviewed. 
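
The three adaptability objectives mentioned above can be compared on a small table of scenario costs. The Python sketch below is illustrative only: the solution names, scenario names, and cost figures are invented, and the best cost in each scenario is taken over the listed candidate solutions, which assumes that the scenario-optimal solution is among them.

```python
def regret_table(cost):
    """cost[solution][scenario] is a performance measure (lower is better).
    Regret = cost of the solution in a scenario minus the best cost
    achievable in that scenario among the listed solutions."""
    scenarios = list(next(iter(cost.values())))
    best = {s: min(row[s] for row in cost.values()) for s in scenarios}
    return {x: {s: cost[x][s] - best[s] for s in scenarios} for x in cost}


def minimax_regret(cost):
    regrets = regret_table(cost)
    return min(regrets, key=lambda x: max(regrets[x].values()))


def minimax_cost(cost):
    return min(cost, key=lambda x: max(cost[x].values()))


def min_expected_cost(cost, prob):
    return min(cost, key=lambda x: sum(prob[s] * cost[x][s] for s in prob))


# Hypothetical demand-weighted travel costs for three location plans.
cost = {
    "X": {"low": 10, "high": 30},
    "Y": {"low": 14, "high": 22},
    "Z": {"low": 20, "high": 20},
}
print(min_expected_cost(cost, {"low": 0.9, "high": 0.1}))
print(minimax_cost(cost))
print(minimax_regret(cost))
```

With these invented figures the three criteria select three different plans (X, Z, and Y respectively), which is why the choice of objective deserves explicit discussion with decision makers.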

We also outlined a health care application of the set covering model that 
results in problems that are approximately two orders of magnitude bigger 
than typical location problems and that has to be solved in real time. The 
application has to do with screening cytological samples and finding the 
minimum number of fields of view needed to read a sample. 

In our view, the accessibility literature and the availability literature are quite 
mature, at least as applied to health care location problems. Considerably 
less work has been done on applying well-known concepts of scenario 
planning, or adaptability modeling, to health care problems. This seems to 
be a potentially fertile area for future work. Related to this is the area of 
reliability modeling. Reliability differs from adaptability in that adaptability 
(or robustness as it is sometimes termed) refers to the ability of a system to 
perform well in the face of uncertain future conditions. The uncertainty is 
typically in the input conditions including the costs and demands. 
Reliability, on the other hand, refers to the ability of a system to perform 
well when parts of the designed system fail [74]. Failures might result from 
capacity limitations or simply facility closures. Menezes, Berman and Krass 
[75] discussed reliability problems associated with Toronto hospitals. They 
noted that it is common for emergency rooms to be at capacity and to request 
that the citywide system redirect emergencies to some other facility. Also,
some hospitals were actually closed due to the SARS outbreak. Daskin and 
Snyder [76] presented two extensions of the P-median model designed to 
consider reliability, while Snyder [74] formulated and solved a variety of
reliability extensions to location models. We believe that adaptability, 
robustness and reliability will become increasingly important in future 
applications in health care. 

Finally, the application of location constructs to problems that do not involve
locating any facilities seems to be an exciting area for future research and 
development. The use of location models in improving the efficiency of 
cytological diagnostic procedures outlined above is but one example of this 
line of research. Another application involves locating radioactive sources 
or seeds in the treatment of prostate cancer [77]. Applications of facility
location-like models in the diagnosis and treatment of medical conditions are
likely to be an important area of future work.

SUMMARY

The ambulance-planning problem includes operational decisions such as 
choice of dispatching policy, strategic decisions such as where ambulances 
should be stationed and at what times they should operate, and tactical 
decisions such as station location selection. Any solution to this problem 
requires careful balancing of political, economic and medical objectives. 
Quantitative decision processes are becoming increasingly important in 
providing public accountability for the resource decisions that have to be 
made. This chapter discusses a simulation and analysis software tool 
‘BartSim’ that was developed as a decision support tool for use within the 
St. John Ambulance Service (Auckland Region) in New Zealand (St. Johns). 
The novel features incorporated within this study include 

- the use of a detailed time-varying travel model for modelling travel 
times in the simulation, 

- methods for reducing the computational overhead associated with 
computing time-dependent shortest paths in the travel model, 

- the direct reuse of real data as recorded in a database (trace-driven 
simulation), and 

- the development of a geographic information sub-system (GIS) within 
BartSim that provides spatial visualisation of both historical data and 
the results of what-if simulations. 

Our experience with St. Johns, and discussions with emergency operators in 
Australia, North America, and Europe, suggest that emergency services do 
not have good tools to support their operations management at all levels 
(operational, strategic and tactical). Our experience has shown that a 
customized system such as BartSim can successfully combine GIS and 
simulation approaches to provide a quantitative decision support tool highly 
valued by management. Further evidence of the value of our system is 
provided by the recent selection of BartSim by the Metropolitan 
Ambulance Service for simulation of their operations in Melbourne, 
Australia. This work has led to the development of BartSim’s successor,
Siren (Simulation for Improving Response times in Emergency Networks), 
which includes many enhancements to handle the greater complexities of the 
Melbourne operations. 

KEY WORDS 

Ambulance service planning, Simulation 

4.1 INTRODUCTION

In 1997 we were contacted by the St. Johns Ambulance Service (Auckland 
region) in New Zealand, henceforth referred to as St. Johns. St. Johns 
wanted assistance in developing rosters for their ambulance personnel. This 
initial contact led to our study of ambulance service management, and to the 
development of a comprehensive simulation and analysis tool to assist in 
decision making. (We should emphasize that here the word “simulation” 
refers to a computer software tool, and not to the replication of realistic 
incident conditions where volunteers pretend to have certain injuries.) This 
chapter reviews some of the issues faced by St. Johns managers, and indeed 
ambulance service managers all over the world, and discusses the methods 
and tools that we developed to assist them. 

The manager of an ambulance service faces a host of difficult policy 
questions related to operation of the service. The following list is only a 
sample. 

- How many ambulances should be employed and where should they be 
stationed? 

- What policies and procedures should be followed as calls for assistance 
are received in order to ensure rapid response to calls while obtaining 
quality information to allow appropriate dispatching? 

- Should ambulances be used for non-urgent patient transfers in addition to 
the usual emergency response function? 

- How should dispatching decisions be made when multiple vehicles are 
available for dispatch? 

- How can one examine the tradeoffs associated with sharing a limited 
number of ambulances between a high-demand metropolitan area and a 
low-demand rural area? Here the issue is “fairness” in the sense of 
coverage, versus “efficiency” in the sense of placing ambulances where 
they will be in high demand. 

This is a rather daunting list of problems, on which a great deal of research
effort has been focused in the past. Swersey [1] provides a survey of work in
emergency service planning that serves as an excellent entry point for the 
literature. There is a very large literature on such problems, so one might 
very well ask, what is the motivation for revisiting these problems? 

A key difference between the ambulance-planning problem as faced before 
1994 and the problem as faced today is the prevalence of data. Virtually all 
ambulance operations now employ some form of computer-aided dispatch
(CAD) system that automatically logs the details of calls as they are 
received. This information is a veritable goldmine for planners! Without 
CAD data, ambulance studies typically relied on manual collection of data; 
see, for example, Swoveland et al. [2], where some of the required data was 
manually recorded over a period of two weeks. 

A second factor that motivated many of the developments discussed in this
chapter is the difference in the questions that are being asked. Much of the 
early development of ambulance theory focused on the questions of where 
and when ambulances should be operated. While this question is central to 
much of what we do, we are also motivated by “finer granularity” questions 
such as how call taking and dispatching should be performed. 

To answer these and other questions at St. Johns, we developed a discrete- 
event simulation of ambulance operations. By manipulating the parameters 
of the simulation, it is possible to address, in a quantitative manner, many of 
the questions mentioned earlier. The flexibility of discrete-event simulation 
means that one can avoid simplifying assumptions that are otherwise needed 
to obtain performance measure predictions using other methods, such as 
queueing theory or Markov chain analysis. Perhaps the biggest advantage of 
simulation is that it is easy to explain as a decision tool to both managers and 
frontline personnel, so that after they understand the model, they place great 
store in its results. Obtaining this “buy-in” from decision makers and 
frontline personnel is crucial in moving from model predictions to decisions 
and implementation. 

To reinforce these points, consider the hypercube model as surveyed in 
Larson and Odoni [3], and the specialization of this model to ambulance 
planning in Brandeau and Larson [4]. The hypercube model, while
possessing great predictive power, also requires several assumptions with 
regard to the way that ambulances are dispatched, gives only steady-state 
results, and requires certain assumptions about the form of “service time” 
distributions, at least in the case where calls queue when all units are busy. 
Moreover, explaining how it works to managers is a somewhat daunting 
task, so that it is hard to instill a feeling of confidence in decision makers as 
to its predictions. In spite of these disadvantages, it seems to work very well 
in practice, so it remains a powerful modeling approach that, for a subset of 
the questions considered here, is a viable alternative to simulation. 

Of course, simulation is not new to the ambulance-planning problem. Early 
examples are Savas [5] for ambulance operations in New York City, and 
Fitzsimmons [6, 7] for operations in the San Fernando Valley in Los 
Angeles. Swoveland et al. [2] used simulation to fit the parameters of a 
metamodel that predicts expected ambulance response time. The expected
response time as predicted by the metamodel was then optimized using 
branch and bound. Simulation was used by Fujiwara et al. [8] to carefully 
examine a small number of alternative plans that were obtained from an 
optimization model developed in Daskin [9]. Lubicz and Mielczarek [10]
developed a simulation model of rural ambulance operations in Poland. 
Ingolfsson, Erkut and Budge [11] used simulation to help in siting a “single- 
start station,” i.e., a station from which multiple ambulances begin their 
shifts. In addition, the use of simulation as a tool to validate the selections 
of optimization models is almost universal in the literature, and continues to 
this day. For recent examples see Erkut et al. [12], Harewood [13] and 
Ingolfsson, Budge and Erkut [14]. For a recent survey of optimization 
methods in ambulance location problems see Brotcorne, Laporte and Semet
[15]. Larson and Odoni ([3], Chapter 7) discuss general considerations 
related to the simulation of problems similar in form to the ambulance- 
planning problem. 

So what is new in this study? 

First, our simulation directly reuses the data recorded in the CAD database. 
Real calls are fed through the simulation, rather than calls generated using 
the usual simulation techniques. Justification for our use of trace-driven 
simulation and discussion of some of the key issues can be found in Section 
4.3. Such an approach resolves many difficulties, including accurate 
modeling of the complex dependence structure of the information related to 
calls including time of occurrence, location, need for transport and so forth.
Of course, it also introduces other problems. 

Second, we employ a sophisticated model, adapted from a model developed 
and used by the Auckland Regional Council [16] for regional planning 
purposes, to compute travel times. These travel times are used to determine 
which ambulance to dispatch to a call, the travel time for the ambulance to 
reach the call, and so forth. The effort we devote to this topic is justified by 
the great sensitivity of results to travel time assumptions, as noted both by 
the authors in a preliminary queueing analysis, and by a large proportion of 
the papers dealing with ambulance planning. For example, Carson and Batta 
[17] describe how the 30% savings predicted by their model turned into a 
6% savings in actual tests, primarily due to the model not effectively 
capturing a certain travel time/distance relationship. The use of a simpler 
model based on the “square root law” [18, 19] or other approximations leads 
to rather large errors due to the highly irregular geography of Auckland; it is 
basically an isthmus between two oceans, containing many dormant volcano 
vents. The complex waterways and vents provide significant barriers to 
travel, leading to a somewhat convoluted road network. A further
complication is that travel times are heavily time-dependent. The simulation
makes extensive use of the travel model, and we employ several heuristics to 
reduce the computational effort involved. Many of the techniques used here 
could be used in other applications requiring travel time calculations where 
the travel time is time-dependent. 

Third, we employ a geographic information system (GIS) to display 
simulation results and to examine historical performance calculated from 
real data. To our surprise, none of the ambulance service providers that we 
have talked with have used such tools in the past, and all have been 
tremendously excited by their potential. This has occurred in spite of the 
growing number of sites where a GIS is being used to draw insights from 
recorded data; see Peters and Hall [20]. Of course, GISs have been used 
many times to obtain input for simulation models (see, e.g., [21]), but GISs 
are not often used for displaying discrete-event simulation output. The 
graphical displays produced by GIS programs allow decision makers to 
digest copious amounts of information that were previously given in large 
tables. GIS output displays are currently under-utilized in discrete-event 
simulation studies, perhaps because of the form of the models involved. But 
as the ability to link discrete-event simulation software, databases, and 
standard GIS packages together increases, the use of GIS output display 
should become more prevalent. 

We have been contacted many times by individuals interested in applying 
BartSim methodology to planning problems in the other emergency 
services, namely fire and police departments. There are many potential 
applications to these areas from the work presented here, and we believe that 
such extensions could be tremendously helpful from the practical standpoint. 
However, it is important to recognize some of the vital differences in these 
problems from the ambulance-planning problem. These differences mean 
that substantial effort would be required to tailor the planning methods used 
here. For example, the utilization rates of fire appliances are typically on the 
order of a few percent, while it is not uncommon to have ambulance 
utilization rates, at least in metropolitan areas in New Zealand, as high as 
60%. In terms of police patrol planning, an important function of police 
patrols is to maintain police visibility, so the problems one faces can be quite 
different. 

The remainder of this chapter is organized as follows. In Section 4.2 we 
discuss some of the particulars of the St. Johns problem, and outline the 
process that is followed when St. Johns receives an emergency call. Section 
4.3 provides an overview of the simulation model underlying BartSim and 
describes some of the data-reuse issues alluded to above. Section 4.4 
describes the travel model and the heuristics used to reduce computational 
overhead. In Section 4.5 we introduce BartSim itself, outline some of its
GIS-based analysis capabilities, and describe how these analysis capabilities 
were used to provide useful insights into several decisions faced by St. 
Johns. Conclusions and suggestions for future research are offered in 
Section 4.6. 

Further details on BartSim can be found on the BartSim web site 
(www.esc.auckland.ac.nz/stjohn). 

4.2 THE PROBLEM FACED BY ST. JOHNS 

St. Johns contracts to Crown Health Enterprises to supply emergency 
medical transport. The contracts stipulate that St. Johns supplies a minimum 
level of service as specified by certain performance targets. These targets 
relate to response time, which is defined as the interval between the time a
call is received and the time an ambulance first arrives at the scene. The
performance targets are broken down by the location of the call (whether the 
call is in metropolitan Auckland, or in a rural area) and the priority of the 
call. St. Johns classifies its emergency calls, as opposed to patient transfers 
and other non-emergency calls, into two levels. Priority 1 calls are those for 
which an ambulance should respond at all possible speed, including the use 
of lights and sirens. Priority 2 calls are calls for which an ambulance may 
respond at standard traffic speeds. The performance targets that St. Johns 
faces are shown in Table 4.1. 



Table 4.1 Contractual service targets

                Priority 1 Calls      Priority 2 Calls
Metropolitan    80% in 10 minutes     80% in 30 minutes
                95% in 20 minutes
Rural           80% in 16 minutes     80% in 45 minutes
                95% in 30 minutes



It is interesting to note that no guidance is given in the contract as to how 
these figures need to be interpreted. Interpreting the targets as applying, for 
example, to the entire Auckland area over the entire year in aggregate will 
lead to far lower resource requirements than assuming, for example, that the 
targets must be met in each suburb during each hour of each day. One of the 
goals of this project has been to develop tools to assist management in 
exploring performance under a range of possible interpretations of the 
contract. 
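
The effect of the interpretation can be seen in a small Python sketch that computes the fraction of calls answered within a target, either in aggregate or separately for each group of calls. The record fields (`suburb`, `response_min`) and the sample figures are hypothetical and are not taken from the St. Johns database.

```python
from collections import defaultdict


def fraction_within(calls, target_minutes, key=None):
    """Fraction of calls with response time <= target_minutes.

    calls: iterable of dicts with a 'response_min' field (hypothetical schema).
    key:   optional function mapping a call to a group label, for example
           its suburb or its (suburb, hour) pair; None means one aggregate group.
    Returns a dict {group: fraction}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for call in calls:
        group = key(call) if key else "all"
        totals[group] += 1
        hits[group] += call["response_min"] <= target_minutes
    return {g: hits[g] / totals[g] for g in totals}


# Invented Priority 1 metropolitan calls, judged against "80% in 10 minutes".
calls = (
    [{"suburb": "A", "response_min": 8}] * 9
    + [{"suburb": "A", "response_min": 12}]
    + [{"suburb": "B", "response_min": 9}] * 3
    + [{"suburb": "B", "response_min": 15}] * 2
)
print(fraction_within(calls, 10))
print(fraction_within(calls, 10, key=lambda c: c["suburb"]))
```

In this invented data set the aggregate figure is exactly 80%, so the target is met under the loosest interpretation, while suburb B on its own reaches only 60%.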

St. Johns uses a computer-aided dispatch (CAD) system that logs, in a
database, information on every call that is received. The database then 
enables St. Johns to prepare monthly reports that describe how well they 
meet their performance targets. When St. Johns first contacted us, these 
reports indicated that the organization was finding it more and more difficult 
to meet its service targets. It was (and continues to be) believed that this is 
primarily due to increasing congestion on Auckland roads. 



To fully understand these service targets it is necessary to understand the 
ambulance dispatch and service delivery process. Figure 4.1 shows this 
process, and identifies the contractual response time discussed earlier. This 
flowchart also helps to explain the key steps that are captured within the 
simulation model. When a call arrives at St. Johns, staff in the control room 
identify an available ambulance (i.e., an ambulance either idle at its base 
station or returning from a previous job) and dispatch this vehicle to the 
scene. After initial treatment at the scene, the ambulance typically transports 
the patient to a hospital, performs a ‘handover’ to hospital staff, and then 
returns to its base station. If transport is not required, the ambulance returns 
directly to its base from the scene. In either case, the vehicle is considered 
available to receive calls as soon as it begins returning to base. 



Figure 4.1 The ambulance dispatch and service delivery process

[Flowchart not reproduced; surviving labels: "Ambulance waiting at station", "Response Time"]

4.3 THE SIMULATION MODEL

The simulation model is written using a high-level programming language 
without using specialist simulation software. The simulation is trace-driven, 
and ambulances are routed using a time-dependent travel model. Each of 
these aspects of the simulation is now discussed in more detail. 

We decided not to use an “off-the-shelf” package for simulating St. Johns’
operations for several reasons. First, the logical complexity of the decisions 
that must be made within the model would be difficult to code in a standard 
package. For example, the dispatcher may redirect an ambulance that is 
responding to a Priority 2 call to a Priority 1 call. Such a decision requires 
detailed knowledge of travel times, ambulance locations and so forth. This 
decision is far easier to code using custom software in a high-level language 
(C) than in standard simulation packages. The second reason was speed. The
simulation must be very fast to facilitate the large number of what-if 
analyses that need to be performed. Consequently, we decided to code the 
simulation in C, and then embed the simulation program within a custom- 
developed Microsoft Visual C++ application to provide a user-friendly 
interface. Third, this approach has allowed us to tightly couple the 
simulation with specialized data visualization (GIS) tools, providing 
integration benefits that would have been hard to achieve using any of the 
off-the-shelf systems that were available at the time. (Since the software 
development was completed, simulation software has made great strides in 
allowing integration with database software and code segments written in 
other languages.) 

We were very lucky in that several years’ worth of historical data was made 
available to us. We used this data by running trace-driven simulations: the 
calls that we simulate are real calls that are read in from a stored file. See p. 
133 of Bratley, Fox and Schrage [22] for a discussion of issues relating to 
the direct reuse of historical data from the general perspective of discrete- 
event simulation. We confine our remarks to specifics related to the 
ambulance-planning problem. 

The data used from each call are call arrival time, call priority, call location, 
time spent by an ambulance at the scene, destination to which the patient 
was transported (if any) and time spent at the destination. The use of this 
historical data obviates the need to develop a statistical model for generating 
calls. This is a decisive advantage, as the correlation structure of calls, both 
temporally and spatially, is rather complex; see, for example, Lubicz and 
Mielczarek [10]. For example, the location of a call is somewhat correlated 
with the time of day at which it is received. 
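
The idea of replaying recorded calls can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the BartSim implementation (which is written in C): it assumes that every ambulance departs from its base, that an ambulance becomes available again only once it is back at base, and that the caller supplies a travel-time function. The field names are hypothetical.

```python
def replay_trace(calls, bases, travel_time):
    """Replay historical calls and return the simulated response times.

    calls: list of dicts sorted by 'time', with keys 'time', 'location',
           'scene_min', 'hospital' (or None) and 'hospital_min'.
    bases: dict mapping ambulance id -> base location.
    travel_time(origin, destination, depart_time) -> minutes (supplied by caller).
    """
    free_at = {amb: 0.0 for amb in bases}  # time each ambulance is next at base
    responses = []
    for call in calls:
        def arrival(amb):
            # Earliest time this ambulance could reach the scene.
            depart = max(call["time"], free_at[amb])
            return depart + travel_time(bases[amb], call["location"], depart)

        amb = min(bases, key=arrival)
        at_scene = arrival(amb)
        responses.append(at_scene - call["time"])

        # Time at scene and at hospital are taken directly from the trace.
        t = at_scene + call["scene_min"]
        here = call["location"]
        if call["hospital"] is not None:
            t += travel_time(here, call["hospital"], t) + call["hospital_min"]
            here = call["hospital"]
        free_at[amb] = t + travel_time(here, bases[amb], t)
    return responses
```

Only the dispatching and travel components are simulated here; the call times, locations, scene times, and hospital times are read from the trace, which is what preserves the correlation structure of the real data.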

Of course, if we were to use BartSim for long-range planning (say more
than two years into the future), we might be more wary about using 
historical data, because the existing data may not be representative of 
conditions in the future. In such a case, one might want to use an approach 
similar to that used in the development of the United Network for Organ 
Sharing Liver Allocation Model [23]. That model uses non-homogeneous 
Poisson processes to generate “arrival times”; other information about the 
“arrival” is obtained through a bootstrapping procedure. 
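
A minimal Python sketch of this kind of call generator follows. It uses thinning, which is one standard way to sample a non-homogeneous Poisson process; the source does not say which sampling method the liver allocation model uses. The rate function, the record fields, and the function names are assumptions for illustration.

```python
import random


def nhpp_arrivals(rate, rate_max, horizon, rng):
    """Sample arrival times on (0, horizon] from a non-homogeneous Poisson
    process by thinning. Requires rate(t) <= rate_max for all t in the horizon."""
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(rate_max)  # candidate from a homogeneous process
        if t > horizon:
            return arrivals
        if rng.random() <= rate(t) / rate_max:  # keep with probability rate/rate_max
            arrivals.append(t)


def bootstrap_attributes(arrivals, historical_records, rng):
    """Attach to each arrival a record resampled with replacement from history."""
    return [(t, rng.choice(historical_records)) for t in arrivals]


# Hypothetical day: 2 calls per hour before noon, 6 per hour afterwards.
rng = random.Random(1)
times = nhpp_arrivals(lambda t: 2.0 if t < 12 else 6.0, 6.0, 24.0, rng)
history = [{"priority": 1, "scene_min": 12}, {"priority": 2, "scene_min": 20}]
synthetic_calls = bootstrap_attributes(times, history, rng)
print(len(synthetic_calls))
```

Resampling whole records keeps the attributes of a single call mutually consistent. Drawing them independently of the arrival time, however, discards correlations such as the one between call location and time of day noted above, unless the resampling is restricted to records from a similar time period.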

An area of concern that arises in using historical data in this fashion is data 
validity. Indeed, many of the logged calls contain entries that are difficult to 
believe. For example, it is not uncommon to see durations of 1 second for 
the time spent at the scene of an incident. Discussions with ambulance 
personnel revealed that this can occur when personnel forget to notify the 
CAD system (through a button situated on the dashboard of an ambulance) 
that they have arrived at the scene. When they realize their error, they 
“catch up” by pushing the button multiple times. This sort of error not only 
corrupts the recorded time spent at the scene, but also any surrounding times, 
such as travel times, that are used elsewhere. Identifying such errors and 
devising methods for dealing with them are important research areas that we 
have not explored. Instead, we adopted an ad-hoc procedure where the data 
for a particular call is “cleaned” if it is “close” to being reasonable, or the 
call is deleted if the logged data is beyond repair. Of course, if too many 
calls require cleaning or deletion then we should be concerned, and this is 
the reason why more research is required in this area. Fortunately, in the St. 
Johns application such calls appear to occupy a very small percentage of the 
total calls processed, so they cannot greatly sway the overall results. 

The use of trace-driven simulation allows one to deal effectively with many 
other issues, such as that of multiple-response calls. Multiple response calls 
occur, for example, because the personnel who initially respond are not 
legally qualified to administer needed drugs, or because the number of 
injured parties is large. Each response to a multiple response call is logged 
in the St. Johns CAD database and linked to previous entries. Within our 
simulation we simply replay these calls. This very simple approach could 
lead to potential errors when the personnel that initially respond in the 
simulation are qualified to assist the patient, so that further ambulance 
responses are not necessary. A more sophisticated simulation approach 
might avoid such errors by carefully analyzing the data record, but we did 
not do this. In any case the number of such multiple response calls is quite 
small. 

Ambulance availability is specified in terms of when and where an 
ambulance is to be brought into operation, and when it is to be removed
from circulation. This allows shifts to be effectively captured, along with
(for example) meal breaks that must be held at the ambulance’s base and 
have a certain minimum duration. 

A vital component of the simulation is a travel time model that computes 
travel times between any pair of locations in Auckland at any time. An 
important step in this project has been to establish collaborative links 
between St. Johns and the Auckland Regional Council, a local government 
body actively involved in developing strategic policy for the city of 
Auckland. The Auckland Regional Council made available a road network 
model that details both road layout information and travel times along roads 
(arcs) at various times of the day, including the morning and evening rush 
periods. The use of this data in BartSim is discussed in more detail in the 
next section. 

It is possible to run the simulation and see ambulance operations unfolding 
on the screen. In particular, one sees ambulances traveling along the road 
network to and from calls. As calls arrive, they are plotted on the screen in a 
color indicating their priority. As calls are assigned to an ambulance, the 
calls change color, indicating that they are being served. This animation is 
extremely useful for verification and validation purposes, and for visualizing 
St. Johns’ operations. It is also tremendously helpful in getting St. Johns 
personnel to accept the simulation model as a reasonable reflection of 
reality, and has proven invaluable in communicating our work to staff and 
management throughout the organization. This aspect of the simulation may 
seem somewhat trivial from a theoretical point of view, but has been 
absolutely critical in obtaining “buy in” from the decision makers. We view 
this selling point as a key advantage of simulation over other operations 
research methodologies for the ambulance-planning problem. The BartSim 
approach is intuitive and easy to understand for people with non-technical 
backgrounds. 

When one wishes to collect performance measures, the animation is an 
unnecessary computational overhead. In this case, animation is turned off, 
and the simulation proceeds without graphical feedback. We do not report 
confidence intervals for our performance measures. This is mostly due to 
the fact that the theory of error estimation from trace-driven simulations is 
not well understood, so that it is not clear how to develop confidence 
intervals. This is an area where more research could certainly help. 

A simulation model on the scale of BartSim requires a great deal of effort 
in verification and validation to ensure that the model that has been 
implemented is indeed what was desired, and that the model appropriately 
represents reality. Instead of entering into a full discussion of our efforts in
this regard, which are mostly direct applications of the usual methods as
outlined in Law and Kelton [24], we content ourselves with a few examples. 

The animation facilities of BartSim proved invaluable in verifying the 
model. By watching simulated ambulance operations over extended periods, 
many errors in the database of real calls were identified. As well as 
replaying existing calls, BartSim also has a facility for interactively 
generating calls. This was used to place calls at strategic locations for 
checking that the ambulance responses were as expected. Shortest paths 
were generated and displayed over the road network to verify the quality of 
the chosen routes. 

The validation of a model involves ensuring that the model appropriately 
represents reality. In this regard, we worked very closely with a number of 
individuals at St. Johns. These people were closely involved in the 
development phase, and also assisted in performing test runs. Furthermore, 
we demonstrated the software and described the simulation model to groups 
of ambulance drivers, who provided feedback on the quality of the model. 
These steps also helped in the accreditation of the model, where the model is 
accepted and trusted by decision makers. The decision makers were so 
closely involved in the development and testing of the model that they felt 
some form of “ownership” over the system. 

4.4 THE TRAVEL TIME MODEL 

Auckland is built around two large harbors between two coastlines, and is 
dotted with dormant volcano vents. Consequently it has a highly irregular
topology. Any plausible simulation of road travel cannot rely on ‘as the 
crow flies’ routes, or simple modifications of these to take into account a 
moderate number of obstacles, but must incorporate knowledge of the road 
network including the effects of motorways and major highways. 
Furthermore, the model must also incorporate the often dramatic changes in 
travel times that arise from varying congestion levels across the day and the 
week. 

We obtained road data from the Auckland Regional Council detailing a 
network with about 2,200 nodes and 5,000 directed arcs. This Auckland 
Regional Transport Model (ART) is a relatively detailed transport model 
developed for medium term (15-25 years) project and policy planning and 
evaluation of regional transport strategy [16]. Traffic volumes are 
determined in ART using equilibrium solutions driven by origin-destination 
trip demands. Because the trip demands are determined using an underlying 
demographic model, travel times can be predicted over any planning horizon 
for which population forecasts are available. This ability to perform
long-term planning is most useful when evaluating strategic decisions such as the
location of ambulance bases. 

We denote the ART road network by $G = (V, A)$, where $V$ is the set of nodes,
and $A$ is the set of directed arcs $(i, j)$ from node $i \in V$ to $j \in V$. By entering trip
demands for different times of the day, a range of equilibrium solutions can
be found, each with different travel times for the arcs. The ARC data
includes the 8 a.m. morning peak travel time $t_{ij}^{8}$, 12 p.m. midday travel time
$t_{ij}^{12}$, and 5 p.m. evening peak travel time $t_{ij}^{17}$ for each arc $(i, j)$. Weighted
combinations of these times are used to estimate the travel time $t_{ij}^{h}$ during
any other hour $h$ of the day. The weights are chosen using regression models
based on actual travel times available in the St. Johns database.
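
The weighted combination of the three ART travel times can be sketched as follows. This is only an illustration under stated assumptions, not the BartSim implementation: the function name, the layout of the weight table, and the weight values themselves are hypothetical (the chapter fits its weights by regression and does not report them).

```python
# Hypothetical sketch: estimate an arc's travel time at hour h as a weighted
# combination of its 8 a.m., 12 p.m. and 5 p.m. ART travel times.

def arc_travel_time(t8, t12, t17, weights):
    """Return w8*t8 + w12*t12 + w17*t17 for one arc.

    `weights` is a (w8, w12, w17) tuple for the hour of interest. In the
    chapter these weights come from regression on logged travel times; the
    values below are made up purely for illustration.
    """
    w8, w12, w17 = weights
    return w8 * t8 + w12 * t12 + w17 * t17


# Made-up weights for two hours of the day (assumption, not fitted values).
HOURLY_WEIGHTS = {
    8: (1.0, 0.0, 0.0),    # at 8 a.m. the morning-peak time is used directly
    10: (0.4, 0.6, 0.0),   # mid-morning: a blend of morning peak and midday
}

# An arc taking 6.0 min at 8 a.m., 4.0 min at midday and 7.0 min at 5 p.m.
estimate_10am = arc_travel_time(6.0, 4.0, 7.0, HOURLY_WEIGHTS[10])
```

Keeping one weight triple per hour means only three travel times need to be stored per arc, whatever number of hours the simulation distinguishes.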

We could use this model to compute dynamic shortest paths for ambulances 
based on time-dependent travel times whenever the simulation requires such 
paths. However, this would be a time-consuming computation that would 
greatly slow down the simulation. As a reasonable approximation, we
instead pre-compute and store a range of shortest paths as follows. Of the 
2,200 nodes in the network, 1,435 are used to spatially locate bends in the 
roads, while 765 are ‘decision nodes’ that define points at which a driver has 
a choice of direction (ignoring U-turn options). More formally, a node $j$
belongs to the set $D$ of decision nodes, $j \in D$, if there exists both an arc
$(i, j) \in A$ and two distinct arcs from $j$, $(j, k_1) \in A$, $(j, k_2) \in A$, with $k_1 \neq i$ and $k_2 \neq i$.
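
A minimal sketch of how decision nodes might be identified from the arc list is given below. It assumes the condition is read as: node j has an incoming arc (i, j) and at least two distinct successors other than i, so that a U-turn back to i does not count as a choice. The function name and data layout are hypothetical.

```python
from collections import defaultdict


def decision_nodes(arcs):
    """Return the set of nodes at which a driver has a choice of direction.

    `arcs` is an iterable of directed (i, j) pairs. A node j qualifies if,
    for some incoming arc (i, j), it has two or more distinct successors
    other than i (U-turns back to i are ignored).
    """
    preds = defaultdict(set)   # preds[j] = {i : (i, j) is an arc}
    succs = defaultdict(set)   # succs[j] = {k : (j, k) is an arc}
    for i, j in arcs:
        preds[j].add(i)
        succs[i].add(j)

    result = set()
    for j, incoming in preds.items():
        if any(len(succs[j] - {i}) >= 2 for i in incoming):
            result.add(j)
    return result


# Two-way road a-b-c with a bend at b, and a junction at c leading to d and e.
example_arcs = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"),
                ("c", "d"), ("d", "c"), ("c", "e"), ("e", "c")]
example_decisions = decision_nodes(example_arcs)
```

In this toy network only the junction c is a decision node; the bend b merely locates the road spatially.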

For each pair of decision nodes $i \in D$ and $j \in D$, we pre-compute three shortest
paths, $P_{ij}^{8}$, $P_{ij}^{12}$ and $P_{ij}^{17}$, using the morning peak, midday and evening peak
travel times respectively. This decision-node path information is stored in
memory. 

During the simulation we need to find the shortest path $S \to F$ between any
arbitrary start point S and arbitrary finish point F. The shortest path process 
we use is heuristic, but nevertheless appears to provide a good level of 
accuracy. 

We note that $S$ and $F$ need not correspond to nodes in the network. The first
step in our process is to determine the spatially closest non-motorway nodes,
$s \in V$ and $f \in V$, to $S$ and $F$, respectively. We next determine the sets of
decision nodes, $D(s) \subseteq D$ and $D(f) \subseteq D$, that are ‘immediately connected’ to $s$
and $f$. The set of decision nodes $D(s)$ is given by $D(s) = T_s \cap D$, where $T_s \subseteq G$ is
a tree with root $s$ and with branches each constructed by adding ‘outward
pointing’ arcs until the first decision node is reached. More formally, $T_s$ is
initialized with root $T_s = \{s\}$, and then $T_s$ is grown by iteratively adding each
arc/node pair $\{(i, j), j\}$: $i \in T_s \setminus D$, $(i, j) \in A$. Similarly $D(f)$ is determined from
$D(f) = T_f \cap D$, where $T_f$ is a tree built at $f$ by adding all ‘inward pointing arcs’,
i.e., adding each arc/node pair

$\{(i, j), i\}$: $j \in T_f \setminus D$, $(i, j) \in A$.
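
The tree-growing step can be sketched as a simple graph search that stops each branch at the first decision node it meets. The sketch below is an illustration only: the function name and adjacency-map layout are hypothetical, and it tracks only the nodes of the tree, not its arcs. Passing a successor map gives $D(s)$ (outward arcs); passing a predecessor map gives $D(f)$ (inward arcs).

```python
def connected_decision_nodes(root, adj, decision):
    """Return the decision nodes 'immediately connected' to `root`.

    `adj` maps a node to its neighbours in the direction of growth
    (successors for the start tree, predecessors for the finish tree).
    `decision` is the set D of decision nodes. Branches are extended only
    from non-decision nodes, so each branch ends at its first decision node.
    """
    tree = {root}
    stack = [root]
    while stack:
        i = stack.pop()
        if i in decision:
            continue              # do not grow past a decision node
        for j in adj.get(i, ()):
            if j not in tree:
                tree.add(j)
                stack.append(j)
    return tree & decision


# Toy two-way road: x - c1 - b1 - s - b2 - c2 - y, with decision nodes c1, c2.
succs = {"s": ["b1", "b2"], "b1": ["s", "c1"], "b2": ["s", "c2"],
         "c1": ["b1", "x"], "c2": ["b2", "y"], "x": ["c1"], "y": ["c2"]}
D = {"c1", "c2"}
D_s = connected_decision_nodes("s", succs, D)
```

If the root is itself a decision node, the search returns just that node, which matches the definition $D(s) = T_s \cap D$ with $T_s = \{s\}$.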

We then consider all the paths given by 

$P = \{ S \to s \to d_s \xrightarrow{h} d_f \to f \to F \mid d_s \in D(s),\ d_f \in D(f),\ h \in \{8, 12, 17\} \}$,

where $S \to s$ (and $f \to F$) denotes ‘as the crow flies’ travel from $S$ to $s$ (and $f$ to
$F$), $s \to d_s$ denotes the unique path from $s$ to $d_s$ in $T_s$, $d_s \xrightarrow{h} d_f$
denotes the pre-computed shortest path from decision node $d_s$ to decision
node $d_f$ at hour $h$, $h \in \{8, 12, 17\}$, and $d_f \to f$ denotes the unique path from $d_f$ to
$f$ in $T_f$. Each of these paths is then evaluated using the interpolated travel
times for the hour in which the journey begins. The $S \to s$ and $f \to F$ travel is
at some assumed off-network speed. The fastest of these paths is deemed 
the shortest path. 
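
The selection among candidate paths amounts to enumerating every combination of start decision node, finish decision node and stored hour, and keeping the fastest. A minimal sketch follows; all names are hypothetical, and the three timing inputs are assumed to have been evaluated already with the interpolated travel times for the hour in which the journey begins.

```python
from itertools import product


def fastest_route(Ds, Df, access_time, egress_time, stored_time,
                  hours=(8, 12, 17)):
    """Pick the fastest candidate S -> s -> d_s -> d_f -> f -> F.

    access_time[d_s]         : time for S -> s -> d_s (off-network leg plus
                               the tree path from s to d_s)
    egress_time[d_f]         : time for d_f -> f -> F
    stored_time(d_s, d_f, h) : time along the path pre-computed with hour-h
                               data, re-evaluated at the journey's start hour
    Returns (total_time, d_s, d_f, h) for the best candidate, or None if
    there are no candidates.
    """
    best = None
    for d_s, d_f, h in product(Ds, Df, hours):
        total = access_time[d_s] + stored_time(d_s, d_f, h) + egress_time[d_f]
        if best is None or total < best[0]:
            best = (total, d_s, d_f, h)
    return best


# Toy data (made up): two start decision nodes, one finish decision node.
access = {"a": 1.0, "b": 2.0}
egress = {"c": 0.5}
table = {("a", "c", 8): 10.0, ("a", "c", 12): 9.0, ("a", "c", 17): 11.0,
         ("b", "c", 8): 7.0, ("b", "c", 12): 7.5, ("b", "c", 17): 8.0}
choice = fastest_route(["a", "b"], ["c"], access, egress,
                       lambda ds, df, h: table[(ds, df, h)])
```

Because the sets $D(s)$ and $D(f)$ are typically small, this enumeration is cheap compared with solving a dynamic shortest path problem for every trip.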

The decision node concept provides two primary benefits. First, without the 
use of this concept, we would need to solve an ‘all shortest paths’ problem 
on 2,200 nodes for each of the three sets of travel times. An ‘all shortest 
paths’ problem on $n$ nodes can be solved using the Floyd-Warshall algorithm
in $O(n^3)$ time (Papadimitriou and Steiglitz [25], p. 133). With the decision
node concept, we solve an ‘all shortest paths’ problem on approximately one
third (765) of the nodes, and therefore reduce the computational effort by a
factor of $3^3 = 27$. We also reduce the memory required to store the shortest
path solutions by a factor of $3^2 = 9$. Second, we consider several paths
involving different combinations of decision nodes when deciding which 
route to take between any origin and destination. This means that the chosen 
route is a compromise between a pre-solved single fixed route, and the true 
shortest path as would be determined by solving a dynamic shortest path 
problem while the simulation is running. 
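
For reference, a textbook Floyd-Warshall routine is sketched below, together with the savings implied by the exact node counts quoted above. The code is a generic illustration with hypothetical names, not BartSim's own routine; it makes visible the cubic running time and quadratic storage on which the savings argument rests.

```python
import math


def floyd_warshall(n, arcs):
    """All-pairs shortest travel times on nodes 0..n-1.

    `arcs` maps (i, j) -> travel time. The triple loop gives O(n^3) time,
    and the distance matrix needs O(n^2) memory.
    """
    dist = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0.0
    for (i, j), t in arcs.items():
        dist[i][j] = min(dist[i][j], t)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Savings from working on 765 decision nodes instead of all 2,200 nodes.
ratio = 2200 / 765
time_saving = ratio ** 3      # cubic running time
memory_saving = ratio ** 2    # quadratic storage
```

With the exact counts the ratio is a little under 3, so the savings come to roughly 24 in time and roughly 8 in memory; the factors 27 and 9 quoted in the text follow from rounding the ratio to 3.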

When an ambulance responds to a Priority 1 call, it travels at ‘lights and 
sirens’ speed. We have captured this effect within the simulation using a 
multiplicative factor to decrease travel times from more standard travel 
speeds. This factor was fitted to data available in the database. We are 
currently exploring other improvements to the modeling of travel speeds.

4.5 BARTSIM

BartSim consists of the simulation program, the travel model, and various 
analysis tools. The simulation and travel models have been outlined in 
previous sections. This section describes the analysis capabilities of 
BartSim. These capabilities may be applied to historical data as recorded 
by the St. Johns organization, as well as simulated data generated by the 
simulation component of BartSim. Informed comparisons can then be 
generated between alternative strategies for operating the ambulance service. 
These analysis capabilities have proven very useful in St. Johns’ decision 
making, several instances of which are mentioned below. 

To protect St. Johns’ confidentiality, all figures presented in this section are 
based on simulated data, rather than actual historical data. Road travel times 
have been perturbed, and all performance figures subjected to random 
perturbation. The number of ambulances operating out of each base has also 
been modified, with the result that we see a lower level of performance and 
greater variability over the Auckland region in terms of response time than is 
actually the case with historical data. 

We record the response time performance on every call, so that a call can be 
classified according to which performance targets have been met. These 
“micro-statistics” may be aggregated into response time performance within 
every suburb of Auckland, within every half hour of the week. When a run 
consists of multiple weeks of real data (the runs usually consist of several 
months of real data), then results in the same time period in different weeks 
are accumulated together. Statistics are also collected on ambulance 
utilization. 
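
The aggregation of per-call “micro-statistics” by suburb and half hour of the week can be sketched as follows. The record layout, function names and the choice of Monday 00:00 as slot zero are assumptions made for illustration; the chapter does not describe BartSim's internal data structures.

```python
from collections import defaultdict
from datetime import datetime


def half_hour_of_week(ts):
    """Index 0..335 of the half-hour slot within the week (Monday 00:00 = 0)."""
    return ts.weekday() * 48 + ts.hour * 2 + ts.minute // 30


def aggregate(calls):
    """Pool per-call results by (suburb, half-hour slot of the week).

    `calls` is an iterable of (suburb, call_time, met_target) tuples.
    Calls falling in the same slot in different weeks are accumulated
    together. Returns {(suburb, slot): (number_meeting_target, total_calls)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for suburb, ts, met in calls:
        cell = counts[(suburb, half_hour_of_week(ts))]
        cell[0] += int(met)
        cell[1] += 1
    return {key: tuple(value) for key, value in counts.items()}


# Made-up calls: two on successive Mondays just after midnight, one on a
# Tuesday afternoon.
sample_calls = [
    ("Silverdale", datetime(2001, 1, 1, 0, 10), True),
    ("Silverdale", datetime(2001, 1, 8, 0, 20), False),
    ("Silverdale", datetime(2001, 1, 2, 13, 45), True),
]
summary = aggregate(sample_calls)
```

Dividing the first count by the second in each cell gives the fraction of calls meeting the chosen target for that suburb and time of week.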

By recording the response time performance on every call, we can generate 
plots such as that given in Figure 4.2. In Figure 4.2 a black dot indicates that 
a call was answered within the 80% time requirement, a gray dot means that 
the call was answered within the 95% time requirement, and a white dot 
indicates that neither of these response time bounds was met. (These colors 
have been modified from those used in the software to improve 
reproduction.) One can visually identify localized areas of poor 
performance. This is a very powerful capability that St. Johns have found 
extremely useful in allowing management to visually interpret data that was 
previously only available in aggregated database report tables. In particular, 
using these plots we were able to verify a belief held by some at the St. 
Johns organization that Silverdale (a suburb of Auckland) needed more 
resources, perhaps because of the strong recent growth in the region. A 
long-dormant station in Silverdale has since been reopened.

Figure 4.2 Response time performance in the Auckland region (data is
illustrative only)

Key: Response time...
black: < 80% target time
gray: < 95% target time
white: > 95% target time

Figure 4.3 Plot of the “reach” of Pitt St. Station during the late
morning/early afternoon period on weekdays (data is illustrative only)

BartSim has proved to be a useful decision support tool for assisting with
the allocation of ambulances to stations. During periods of low call demand, 
performance targets can be met by using just a few stations to cover the 
entire Auckland region. We can identify the “reach” of a station by 
producing plots like that of Figure 4.3. 

In this plot, we computed the travel time from a single station to all calls. 
By coloring the call locations as above, we obtain a vivid picture of the area 
that can be covered by positioning an ambulance at a given station. Since 
travel time varies dramatically with the time of day, we can obtain a clearer 
picture of the station’s reach at a given time by filtering the calls, so that we 
only display those arriving during a subset of the week. Figure 4.3 contains 
only those Priority 1 calls received in the late morning/early afternoon on
weekdays. By repeating such plots for several stations, we can identify a 
suitable subset of stations that may be used to cover Auckland during 
various times. 

As mentioned above, we can filter the calls so that one can “zoom in” on a 
particular time, or a particular area of Auckland, or both. The performance 
measures for the time and area of interest are then calculated, allowing one 
to identify response time performance for centrally located calls, for 
example. A sample screenshot of such an analysis is given in Figure 4.4. 
The small window in the upper screen area contains detailed information on 
contractual target performance for a case where ambulance allocation is too 
light, so that the targets are not met. 

The plots described above are very useful for providing an overview of 
performance. In addition, plots such as those in Figure 4.4 allow one to 
provide precise numerical information on performance in a localised region. 
It is also desirable to be able to summarise on-time performance (relative to 
the contractual targets) over the entire Auckland region at once; Figure 4.5 is 
an example of such a plot. In this figure, the Auckland region has been 
broken down into rectangular regions. Within each region, we compute the 
percentage of Priority 1 calls reached within the required time limit (10 
minutes for urban calls, 16 minutes for rural calls). To allow one to focus on 
regions containing significant numbers of calls, regions containing a small 
number of calls are suppressed in the output. Furthermore, the size (area) of 
the rectangles reflects the number of calls received within the region. We 
can also substitute other performance measures, such as the number of calls 
received, or the percentage of Priority 2 calls reached within the required 
time limit, in place of the performance measure used in this example.

Figure 4.4 Filter applied to results to identify performance in the city
centre (data is illustrative only)
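
The region-wide summary described above (on-time percentage per rectangular region, with sparse regions suppressed) can be sketched as follows. The 10- and 16-minute limits are those quoted in the text; the square grid, the coordinate units, the record layout and the function name are assumptions for illustration only.

```python
from collections import defaultdict

URBAN_LIMIT_MIN = 10.0   # Priority 1 limit for urban calls (from the text)
RURAL_LIMIT_MIN = 16.0   # Priority 1 limit for rural calls (from the text)


def grid_on_time(calls, cell_size, min_calls):
    """Summarise on-time performance over a square grid.

    `calls` is an iterable of (x, y, response_minutes, is_urban) tuples.
    Returns {(col, row): (percent_on_time, n_calls)} for cells containing at
    least `min_calls` calls; cells with fewer calls are suppressed.
    """
    tally = defaultdict(lambda: [0, 0])
    for x, y, minutes, is_urban in calls:
        limit = URBAN_LIMIT_MIN if is_urban else RURAL_LIMIT_MIN
        cell = tally[(int(x // cell_size), int(y // cell_size))]
        cell[0] += int(minutes <= limit)
        cell[1] += 1
    return {c: (100.0 * on_time / n, n)
            for c, (on_time, n) in tally.items() if n >= min_calls}


# Made-up calls: four in one cell, a single isolated call in another.
sample = [(0.5, 0.5, 8.0, True), (0.7, 0.2, 12.0, True),
          (0.1, 0.9, 9.5, True), (0.3, 0.3, 15.0, False),
          (5.5, 5.5, 20.0, False)]
grid = grid_on_time(sample, cell_size=1.0, min_calls=2)
```

The call count returned with each percentage is what a plot would use to scale the size of each rectangle.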




Figure 4.5 is perhaps the most useful of all the plots described thus far in 
terms of determining required ambulance allocations. We vary the 
ambulance allocations between bases (usually heuristically, but one could 
also use optimisation methods), run the simulation, and then observe the 
performance in terms of these plots. Using these plots, we can locate areas 
with both a poor overall on-time performance and a large number of calls. 
These areas are good candidates for extra ambulance resources. 
Furthermore, by filtering the calls by time and producing the same plots, we 
can identify times when extra ambulances are most likely to have a large 
impact on the performance measures. 

These plots revealed something unexpected when applied to historical data 
for the St. Johns organisation. In one small suburban area (not shown), a 
disproportionate (relative to neighbouring areas) number of calls were 
appearing. Upon investigation it was discovered that there are several 
accident and emergency clinics in this area, and such clinics generate many 
calls for St. Johns. The St. Johns organisation was apparently unaware of 
this situation, and is considering our recommendation that they ensure that 
an ambulance be relocated close to this vicinity. 



Figure 4.5 Plot of average service quality (indicated by the numerical
values) and the number of calls (indicated by the size of the white
squares) for grid areas in Auckland (data is illustrative only)

BartSim can also produce simple histograms of various characteristics of
calls, such as response time, time spent by an ambulance at the scene, and so
forth. One such histogram is given in Figure 4.6, showing the time between
a call being received and an ambulance being dispatched for a set of
simulated metropolitan Priority 1 calls. The histogram shows very clearly 
that for many of the calls, a large amount of time is spent before an 
ambulance is dispatched to a call. Time spent in the dispatch process 
reduces the amount of time that an ambulance has to reach the scene of a 
callout if it is to meet the contractual performance targets. A plot similar to 
this for the historical data recorded by St. Johns was one of our most 
important findings for the organisation. Small decreases in these dispatch 
times can have (as simulations quantified) a large impact on contractual 
performance, so that it is worth devoting considerable effort to determining 
ways in which the dispatch time can be reduced. Apparent inefficiencies in 
the dispatch process can, when considered in view of the overall goals of the 
organization, actually be viewed as efficiencies, especially when the 
alternative expense of additional ambulance units is considered.

Figure 4.6 Distribution of the interval (in minutes) between a call
being received and an ambulance responding (by radio) that it is en
route (distribution is illustrative only)




BartSim also produces statistics on ambulance utilisation. These statistics 
may be imported into a spreadsheet (we use Microsoft Excel), and analysed 
from there. An example of the type of graphs that can be produced is given 
in Figure 4.7. This graph depicts the underlying demand near one of the 
stations operated by St. Johns. Each row of bars reflects the performance 
that can be expected over the week when a given number of ambulances are 
stationed at the base. In particular, each individual bar reflects, for a given 
number of ambulances and time of the week, the percentage of time that no 
ambulance is available to respond to incoming calls. This information is 
extremely useful for getting a first approximation to the number of 
ambulances required at each individual base at different times of the week. 
Of course, one would cover some proportion of these calls from other 
stations, but the plot gives an impression of the underlying demand.

Figure 4.7 Ambulance utilisation/requirements at one station (data is
illustrative only)

As a final example of the nontraditional uses of BartSim, we mention that
at a certain stage St. Johns was considering the use of a dispatching strategy 
that was expected to have a number of effects. First, it would better match 
the skills of the staff with the patient’s requirements at the scene, thus 
resulting in better care. Second, it would result in fewer Priority 1 
dispatches being made because the improved data collection would allow 
more cases to be classified as Priority 2. Priority 2 cases have a longer target 
response time so the performance targets for these cases would appear to be 
easier to meet. However, vehicles on Priority 2 dispatches do not use lights 
and sirens, so the time a vehicle spends on a case increases if it is changed 
from Priority 1 to Priority 2. The improved case classification would come 
at the cost of increased dispatch times. These changes were built into the 
simulation using approximations for the extent of the effects, and then 
comparisons between the current and proposed system were drawn based on 
the plots discussed in this section. The analysis played a large role in 
determining whether the proposed system would be adopted. 

4.6 CONCLUSIONS 

BartSim has been used to evaluate several decisions considered by St. 
Johns, including the use of a dedicated non-emergency patient transfer 
service, the possible introduction of a new dispatching method, and changes
to where and when ambulances should be allocated. The results of these
studies have been used to shape policy at St. Johns, and we continue to work 
with St. Johns on these and other issues, including rostering requirements for 
their staff. This experience has convinced us that simulation is a powerful 
tool in emergency service planning that is currently underutilized. Good 
simulation visualization tools have proven invaluable as a communication 
tool for describing our work to management and staff of St. Johns. The 
spatial data visualization capabilities have provided management with a 
significantly improved understanding of their current performance and, in 
conjunction with the simulation model, allowed results from what-if 
analyses to be readily communicated and understood. 

It is important in vehicle simulation models to accurately capture travel time 
information. We have developed heuristics that allow both accurate 
modeling of travel times and rapid simulation run times. In addition, we 
introduced the notion of a decision node, which dramatically decreases the 
time required to compute shortest paths in the networks. This concept may 
be of interest in other applications where shortest paths must be calculated in 
large networks. 

The travel times predicted by our model are deterministic: the same time is 
always predicted for travel from one point to another at a given time on a 
given day. However, travel times can vary tremendously depending on 
unpredictable events such as traffic congestion, weather, and traffic 
accidents. It is our belief, based on some initial analysis with very simple 
models, that randomness in travel times can have a material effect on the 
predictions of a model, and this is an area that we are beginning to 
investigate. Some care is needed, as it is not immediately clear how to 
generate random travel times. In general, there will be “macro” effects, such 
as those described above, which affect many ambulance trips in the same 
way, whereas other “micro” effects, such as traffic light phasing, might be 
confined to a single ambulance trip. 

The combined simulation and data visualization tools introduced here have 
been of tremendous help to St. Johns, and several other ambulance 
companies have expressed interest in using the system within their 
organization. In our experience, the combination of CAD databases, GIS
visualization methods and simulation leads to more informed decision 
making, and better utilization of resources, than the previous state of the art 
has supplied. 

Since preparing this chapter, BartSim has been selected in a competitive 
tendering process for use in Melbourne, one of the larger cities in Australia. 
As part of this work, BartSim has evolved into a more powerful system
known as SIREN (Simulation for Improving Response times in Emergency
Networks) (see http://www.optimal-decision.com). Enhancements include 
call generation using non-homogeneous Poisson processes, introduction of 
stochastic travel times, more detailed case classifications, and more 
sophisticated simulation logic to handle the increased operational complexity 
of this new problem. For example, Siren can dispatch several vehicles to a
call, one of which is left at the scene while the ambulance officers travel in 
the other vehicle to the hospital. Upon leaving the hospital, this vehicle then 
travels back to the scene where the officers return to their original vehicles. 
The transport model has also been enhanced to reduce the memory 
requirements of the pre-computed shortest paths, allowing a network with 
6,000 nodes and 14,000 arcs to be handled. This network also allows 
shortest distance (in addition to fastest time) routes to be calculated, and 
includes arc-specific times for lights and sirens travel. It is pleasing to see 
the value that Siren can add being recognized by another ambulance 
organization.

SUMMARY

The chapter starts with a strategic overview of the blood banking supply 
chain. We then proceed to ask and answer questions concerning (i) the 
blood banking functions that should be performed and at what locations, (ii) 
which donor areas and transfusion services should be assigned to which 
community blood centers, (iii) how many community blood centers should 
be in a region, (iv) where they should be located and (v) how supply and 
demand should be coordinated. Then the many tactical operational issues 
involved in collecting blood, producing multiple products, setting and 
controlling inventory levels, allocating blood to hospitals, delivery to 
multiple sites, and making optimal decisions about issuing, crossmatching, 
and crossmatch releasing blood and blood products are presented. The 
chapter concludes with areas for future research.

5.1 INTRODUCTION

As a supply chain, the flow of blood and blood products from the donor to 
the patient would seem to be one of the simplest inventory and distribution 
problems in the supply chain literature. Perhaps it is. One merely collects 
whole blood from donors, processes it into its components at a regional 
blood center or a community blood center and delivers the components to 
hospitals where they are transfused into patients. Geographically, the 
situation is shown in Figure 5.1. 

Figure 5.1 A geographic region for blood supply and demand 




In a geographic region, a regional blood center (RBC) with satellite 
community blood centers (CBCs) or, in smaller regions just the regional 
center without satellites, will be responsible for providing a supply of blood 
products (components) to hospitals for patients. To do this, a schedule of 
donor drawing locations is made some months in advance. Donors are 
solicited to give blood at the locations as the drawing time nears. Mobile 
phlebotomy vans with medical and service personnel and equipment are sent 
to the sites on the scheduled days. Decisions are made to prepare various 
components from the whole blood so the appropriate bags are used when 
drawing the blood. The drawn whole blood is returned to a processing 
location where it is recorded, tested for viruses and diseases, and the 
components are prepared. The resulting components are then inventoried 
and appropriate shipments are made to the hospitals based on their inventory 
needs. The hospital staffs then make decisions on how and when to use the 
blood components. If a particular blood component exceeds its allowable
age it cannot be used for transfusion to a patient and must either be discarded
or, for a few products, some modest salvage is possible. Some components,
such as platelets, can be obtained directly from a donor by a process called
pheresis. In this process a donor is connected to a machine that continuously 
circulates the donor's blood through the machine. The desired component is 
extracted from his/her blood and the remaining blood is returned to the 
donor. The process is costlier than the extraction of platelets from donated 
whole blood. 

What makes this problem interesting and/or difficult from a research 
perspective? First, blood is a perishable commodity and whole blood has 
many components, each of which has a different shelf life before it perishes. 
The preparation of different components involves significant costs. Second, 
the supply of whole blood at a donor drawing location is a random variable 
that often has a large variance and, for planning purposes, the donor drawing 
locations and drawing dates are themselves sometimes random variables 
(Figure 5.1). The supply is also impacted by the need to screen out a 
growing list of viruses and diseases before the blood and its components 
may be used for transfusions; more variability and more risks are introduced. 
Third, the demands for blood components at a hospital in both their amounts 
and frequency are random variables (Figure 5.2). Fourth, many interacting 
decisions must be made at the strategic design, strategic policies, and 
operational and tactical levels. All are affected by the need to control costs, 
to minimize outdating and waste and, above all, to control potential 
shortages. Fifth, the entire blood supply chain can be examined as an 
essentially whole system and not just a subsystem of some larger system as 
occurs in most other supply chains. And finally, from a research 
perspective, much technically interesting, generalizable theoretical research 
can be extracted from the real problem regarding perishable inventories and 
regarding disease testing. In the future research section, other interesting 
unresolved theory questions will be raised. 

Figure 5.2 shows the daily number of whole blood units drawn by the 
community blood centers in the Chicago area for one year. The drawing 
amounts range from zero to over 1,100 units and the variation is very large. 
It can be seen that in the January and November-December periods and in 
the summer the numbers of units drawn are below average and indeed there 
are often critically low inventories for patient needs. 

O+ blood is one of the most common blood types. Figure 5.3 shows the
range of daily demands for patient needs from a low of less than 10 units to a 
high of over 140 units and with significant daily variation throughout the 
year. This variability is typical for all blood types at large and small general 
hospitals that treat both emergent and elective admission patients. 

Figure 5.2 Daily phlebotomy drawings by the CBCs in the Chicago area for one year 




Figure 5.3 Daily O+ crossmatches for a large Chicago hospital for one year 




The basic supply chain for whole blood and its components is given in 
Figures 5.4 - 5.7. 

The organizational structure and the geographic region for a CBC or an RBC 
has usually evolved as the region’s system of hospitals has grown and 
changed (Figure 5.4). In the early years as blood and components came into 
therapeutic use, hospitals began to draw blood and make components 
themselves. Today many very large metropolitan hospitals still do so for a 
significant amount of their supply needs. But, in general, as the demands 
grew, hospitals found it to be more cost effective to seek a dependable 
central source for blood and components and also for the latest knowledge 
and research. As this growth occurred, CBCs evolved to meet the needs of 
groups of hospitals. In some regions only one CBC became dominant and 
met the needs of the region, whereas in other regions several CBCs 
successfully met the region’s needs. In all cases, the intent of the hospitals 
was to obtain a dependable supply at minimal cost. For dependability this 
supply also had to be of the highest quality (free of blood borne diseases and 
meeting the best standards for therapeutic use) and always available when 
needed (no shortages). Because a significant part of the cost is recruiting 
donors and drawing, processing, storing, documenting and transporting 
blood, in order to minimize costs it was also necessary to minimize 
outdating and waste. 

Figure 5.4 Hierarchical regional structure 

Depending upon the various levels of demand and the geographic location of 
a hospital, the CBC will make regular shipments of whole blood and 
components on a twice daily, daily, biweekly or weekly basis to the hospital. 
In a metropolitan area, most regular shipments would be on a daily basis. 
For outlying rural areas, the shipments may only be weekly. 

In this process of inventory and distribution management the CBC must 
decide: 

1. its own optimal inventory levels to maintain, 

2. its inventory allocation policy in the event demands from the Hospital 
Blood Banks (HBBs) exceed the CBC inventories, 

3. a trans-shipment policy from some HBBs to others in the event that 
there is an overall system shortage but some HBBs have a greater risk of 
shortages than others, and 

4. a recycle policy of bringing old but still useful blood at an HBB back to 
the CBC for use at other HBBs with higher levels of demands and higher 
probability of using that blood before its expiration (Figure 5.5). 

The HBB, itself, must decide its own optimal inventory levels to maintain. 
Depending on the corporate or contractual relationship between the HBB 
and the CBC, these levels may be set independently of or in conjunction 
with the CBC. More will be said about these optimal inventory levels later 
in the chapter. 



Figure 5.5 The regional supply chain 




In most cases, the demands for red cells and for the various blood 
components are independent random variables. However, since the red cells 
and components come from the same source - donors - there is a high level 
of dependence created on the supply side (Figure 5.6). Furthermore, the 
process of collecting the whole blood in appropriate types of bags and then 
making, storing and distributing the components can be costly, depending 
on the numbers and types of components made. Finally the components have 
different shelf lives, further complicating the supply chain processes. 

Figure 5.6 Processing whole blood into components 

In addition to its optimal inventory levels, the HBB must decide its issuing 
policy (usually last-in-first-out (LIFO) or first-in-first-out (FIFO)), its cross- 
match demand and its cross-match release policy (Figure 5.7). Cross- 
matching is the process of testing for incompatibilities between the patient’s 
blood and the donated blood that the patient could potentially receive. The 
cross-match demand policy is the number of units of blood or a component 
that should be cross-matched to a patient’s blood and assigned to that patient 
prior to its use. For whole blood and packed red cells (PRCs), the number of 
units will often be about one to two standard deviations above the average 
needed for the procedure. The cross-match release policy is the number of 
days after the patient’s procedure that the unused blood or components will 
stay assigned to the patient in the event of emergency needs due to 
complications. For whole blood and PRCs this is often one or two days. 
Obviously the units continue to lose shelf life while on cross-match to that 
patient. Once released from this assignment, the units return to inventory 
(older) and can be cross-matched for use by another patient when needed. 
The reason for the assigned inventory is that it takes time to do the 
cross-matches and in an emergent situation the patient will not be able to wait. 

Figure 5.7 Hospital i’s inventory process and decisions 
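The cross-match demand and release policies described above can be illustrated with a minimal Python sketch. The function names, the rounding-up rule, and the example numbers are assumptions made for illustration only; the text states just that the assigned quantity is often one to two standard deviations above the average need and that the release period is often one or two days.

```python
import math


def crossmatch_units(mean_units, sd_units, k=1.0):
    """Units to cross-match: mean usage plus k standard deviations, rounded up.

    Rounding up to a whole unit is an assumption of this sketch.
    """
    return math.ceil(mean_units + k * sd_units)


def shelf_life_after_release(remaining_days, days_to_procedure, release_days):
    """Shelf life left when unused units return to free inventory (never below zero)."""
    return max(0, remaining_days - days_to_procedure - release_days)


# Hypothetical procedure: averages 4 units with a standard deviation of 1.5 units.
print(crossmatch_units(4, 1.5, k=1))  # one standard deviation above the mean
print(crossmatch_units(4, 1.5, k=2))  # two standard deviations above the mean

# Hypothetical unit with 20 days left, assigned 1 day before the procedure
# and held for a 2-day release period afterward.
print(shelf_life_after_release(remaining_days=20, days_to_procedure=1, release_days=2))
```

The second function makes the cost of the policy visible: every day a unit spends assigned to one patient is a day of shelf life that is no longer available to any other patient.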




5.2 LITERATURE REVIEW 

Research on the regional and the local management aspects of the blood 
supply essentially started in the 1960s, peaked in the late 1970s and early 
1980s and then dropped off significantly to the present time. Excellent 
reviews of the work to the mid-1980s can be found in Prastacos [1], 
especially with regard to blood bank management policies and decisions, 
and in Nahmias [2] regarding theories of perishable inventories. In the years 
since these two reviews were published, almost every OR/MS researcher has 
left this area of research to pursue other interests. To some extent this 
exodus was caused by the collapse of federal funding for studies in the area 
(which reduced support for MS and PhD students), by the increasing 
difficulty of the remaining problems in the area, and by the shift of emphasis 
toward research on blood supply safety. 

Since the mid-1980s, the published management-oriented work in blood 
banking has mostly been in the development of information systems to 
support donor screening, inventory management, blood ordering, blood 
usage review and compatibility testing [3]. Indeed, much prior work on 
information systems (IS) had been reported in earlier decades, but the advent 
of the personal computer (PC) and PC networks has driven the development 
of new structures and uses for blood information. In addition to improved 
IS, more new technologies are being introduced to improve the logistics and 
safety of the blood supply and delivery [4]. The National Blood Service, 
which is the central blood service for the United Kingdom (in a sense a super 
RBC), is considering introducing electronic tags for every blood bag with 
sensors that tell all donor-specific details, blood type, and the exact location 
and temperature in real time for that bag. Clearly such a technology will 
greatly increase the safety and quality of the blood supply as well as provide 
logistics data for optimal supply chain management. 

Few studies of note in the application of OR/MS have appeared in the past 
two decades. A platelet inventory management model was developed to 
determine outdate and shortage rates as a function of base stock levels and 
mean daily demand [5]. Using simulation, the model provided the base 
stock levels for different mean daily demand such that the platelet outdates 
and shortages in a region were significantly reduced. In another study, the 
task of scheduling donors at a bloodmobile site was undertaken [6, 7]. This 
modeling involved issues of donor motivation and psychology, layout of the 
collection facility and managing serial and parallel queues. Using a 
simulation model, the authors were able to improve the registration, 
screening and phlebotomy processes, which in turn improved donor 
satisfaction and reduced donor balking and reneging in future blood drives. 
The employers at the sites that the bloodmobile visited were also better 
satisfied because the new layouts and scheduling reduced employee waiting 
times to donate. 

5.3 THE REGIONAL BLOOD BANKING SYSTEM 

5.3.1 Regional structures and economies of scale 

A strategic question regarding the regional supply chain for blood and 
components is: what are the economic and organizational consequences of 
different forms of regionalization of blood banking services? 

If regionalization is to be effective, it must make a positive contribution to 
the achievement of one or more of the following objectives: reducing costs, 
reducing shortages and outdates, reducing extra-regional dependencies, 
improving the quality of the products, and reducing the confusion of 
overlapping jurisdictions. A search for economies of scale was thought to be 
the most logical starting point to analyze these factors. It is already known 
that well-planned regionalization can reduce shortages and outdates by 
smoothing the region-wide supply and demand fluctuations (law of large 
numbers); however, cost improvements will occur only if there are 
economies of scale in regional operations. 
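The smoothing effect attributed to the law of large numbers can be made concrete with a short Python sketch. It assumes, purely for illustration, that the hospitals in a region have independent and identically distributed daily demands; the function name and the numbers are hypothetical. Under that assumption the coefficient of variation of pooled demand falls in proportion to one over the square root of the number of hospitals.

```python
import math


def pooled_cv(mean, sd, n_hospitals):
    """Coefficient of variation of total demand across n independent, identical hospitals."""
    total_mean = n_hospitals * mean
    total_sd = math.sqrt(n_hospitals) * sd  # variances add for independent demands
    return total_sd / total_mean


# Hypothetical hospital: mean daily demand of 20 units, standard deviation of 10 units.
for n in (1, 4, 16, 64):
    print(n, round(pooled_cv(20, 10, n), 4))
```

Real hospital demands are neither identical nor fully independent, so the sketch overstates the benefit; it shows only the direction and rough size of the pooling effect.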

The regional structures of interest are embedded in Figure 5.4 and illustrated 
in Figure 5.8. Level 1 is the Regional Blood Center, Level 2 is the 
Community Blood Centers and Level 3 is the Hospital Blood Banks (HBBs) 
and other Transfusion Service (TS) locations such as clinics and 
surgicenters. (For ease of writing we will consider TSs as little HBBs and not use 
the TS notation.) In a region, if Level 1 does not exist, it means all HBBs 
are served by two or more CBCs only. In a region, if Level 2 does not exist, 
then a single RBC serves all HBBs (effectively operating as the sole CBC). 

Figure 5.8 illustrates the different regional structures of interest. These are 
the single community blood center for the entire region, a collection of 
independent CBCs for the region or a collection of CBCs controlled or 
coordinated by an RBC. As a general rule, as the size of a region changes 
due to an increase in demand or geographic reach, the single community 
blood center may not adequately fill the needs of the region, and one of the 
other two structures will tend to replace it over time [8, 9]. 



Figure 5.8 Different organizational structures for a region 

In order to identify the economic and organizational consequences of 
different forms of regionalization of blood banking services, data were 
gathered from seven Chicago community blood centers and 66 Chicago area 
hospitals, as well as five other regional blood centers from around the nation. 

Because wage rates, depreciation, purchasing costs of goods and supplies, 
rent, utilities and other costs vary greatly from one region to another, a proxy 
for costs was used. Instead of dollar costs for the geographic regions, 
man-hours per unit were used to derive the production function for each 
functional area of blood banking and for combinations of functional areas. 
The functional areas of main interest are: 



(i) donor services (recruitment of donors and donor organizations), 

(ii) phlebotomy on mobile units (collection and transport of the whole 
blood from donor locations), 

(iii) phlebotomy at the community center, 

(iv) processing (testing, typing and component preparation), 

(v) inventory and distribution (storing and transport to hospitals), and 

(vi) administration. 

The overall total costs were also analyzed for scale economies. 

The choice of man-hours removes the need to adjust dollar costs for the 
different wage rates experienced throughout the country. Great variation in 
man-hours per unit occurs across centers. Some of this variation is due to 
economies of scale or may result from different geographic distances 
covered, different proportional amounts of components produced, saturation 
of the donor market, style of management and expansion dislocations. 

To reduce some of the data variations in the workload at the different centers 
for collecting, processing and inventory and distribution activities due to 
different proportions of components, a study of the times required to make 
the different components was undertaken. Using the results of these time 
studies, time-weighted volumes of activity were defined for each blood bank 
function and each center. In this manner, it was possible to compare the 
workload activities at all the centers for each function. In mobile 
phlebotomy, the units used to measure the workload were the 
amounts of whole blood drawn on the mobiles. In donor recruiting, the units 
were the whole blood drawn at the blood center, satellites and mobiles. In 
processing, the units were the whole blood drawn, plus weighted handling 
and processing times for the other components based on a normalized weight 
of 1.0 for whole blood. In the inventory and distribution area, the units were 
the total units shipped including whole blood and components. In the 
administrative and total manpower areas, the units were the appropriate units 
for each of the functional areas weighted by the percentage of the staff in 
each of the areas. 

Because of the nature of the supply chain processes, some or all of the 
functions can be performed at Level 1 (the RBC) or at Level 2 (the CBCs). 
However, the process flow determines the order in which the functions are 
performed. Consequently there are only six possibilities for deciding which 
functional combinations can be performed at which levels as we analyze 
what is the best organizational structure for regional blood banking. 

Table 5.1 Options for regional operations where the blood banking 
functions are performed in different combinations at the RBC and 
CBC levels 

Option  Functions at RBC (Level 1)       Functions at CBC (Level 2)
------  -------------------------------  -------------------------------
1       None                             Inventory and Distribution,
                                         Processing, Phlebotomy at
                                         Center, Phlebotomy on
                                         Mobiles, Donor Services

2       Donor Services                   Inventory and Distribution,
                                         Processing, Phlebotomy at
                                         Center, Phlebotomy on
                                         Mobiles

3       Phlebotomy on Mobiles and        Inventory and Distribution,
        Donor Services                   Processing, Phlebotomy at
                                         Center

4       Processing, Phlebotomy on        Inventory and Distribution,
        Mobiles and Donor Services       Phlebotomy at Center

5       Processing, Phlebotomy at        Inventory and Distribution
        Center, Phlebotomy on
        Mobiles and Donor Services

6       Inventory and Distribution,      None
        Processing, Phlebotomy at
        Center, Phlebotomy on
        Mobiles and Donor Services

The combinations of functional areas are the options designated 1 to 6 in 
Table 5.1. For example, Option 5 reflects all tasks except inventory and 
distribution to be performed at Level 1 (the RBC); Option 2 reflects all tasks 
except donor services to be performed at Level 2, and so on. Using this set 
of six options, it is possible to analyze the structures given in Figure 5.8. 
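The six options can be generated from a single ordering of the functions, as the following Python sketch shows. The ordering in `MOVE_ORDER` is read off the rows of Table 5.1 (the sequence in which functions shift from the CBCs to the RBC); the names `MOVE_ORDER` and `option` are illustrative and do not come from the original study. Administration is omitted because Table 5.1 does not list it.

```python
# Order in which functions move from the CBCs (Level 2) to the RBC (Level 1),
# as read off the rows of Table 5.1. This ordering is an interpretation of the table.
MOVE_ORDER = [
    "Donor Services",
    "Phlebotomy on Mobiles",
    "Processing",
    "Phlebotomy at Center",
    "Inventory and Distribution",
]


def option(k):
    """Return (functions at RBC, functions at CBC) for option k in 1..6."""
    if not 1 <= k <= 6:
        raise ValueError("option must be between 1 and 6")
    return MOVE_ORDER[:k - 1], MOVE_ORDER[k - 1:]


for k in range(1, 7):
    rbc, cbc = option(k)
    print(k, "RBC:", rbc or ["None"], "CBC:", cbc or ["None"])
```

Because each option is a prefix of one ordering, five functions yield exactly six splits, which matches the count of options in the table.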

It was hypothesized that economies of scale exist in all options, with the 
possible exception of donor services. In donor services, as the geographic 
area expands and the donor market reaches a saturation level, it was 
hypothesized that increasingly more donor recruiter hours are needed to 
obtain the additional units of blood. 

For the individual functions it was found that (i) the economies of scale 
hypothesis was significant for inventory/distribution and (ii) economies of 
scale occur initially, later followed by constant returns to scale in 
phlebotomy at the center, mobile phlebotomy, administration and 
processing. Donor services seemed to exhibit diseconomies of scale. When 
all functions are performed in a single center, economies of scale exist 
initially and are significant [10, 11]. 

For the options that correspond to specific regional organizational structures 
ranging from totally centralized activities to totally decentralized activities 
(Options 1 and 3-6) there are economies of scale. In particular, at the lower 
volumes (10,000 red cell units annually), the economies of scale are very 
significant. From 50,000-75,000 units, economies of scale are not as 
dramatic. Above 75,000 units the curves tend to flatten out but still show 
some small economies of scale. Caution should be exercised in using the 
curves past 200,000 weighted units since they were derived with only four 
data points. Option 2 exhibited economies of scale at the CBCs but 
diseconomies of scale at the RBC because in this option the RBC provides 
only donor services. 

This analysis leads to two related conclusions. First, a regional system with 
community blood centers that are operating below 50,000 weighted units can 
realize significant economies of scale by increasing volume. These 
economies come from a more efficient utilization of space, equipment and 
vehicles, specialized skills and learning curve effects. Second, a regional 
system with one community blood center is more economical than a regional 
system with two CBCs, two are more economical than three, and so on. The 
example in Table 5.2 shows that none of these community blood centers 
should operate at less than 50,000 red cell units annually. Thus a region 
with slightly over 200,000 units annually is operated most economically 
with one center. The costs of two or three community blood centers (even if 
all are over 50,000 units) rise rapidly. This analysis leads to the conclusion 
that a region should have only one CBC (which would also be the RBC by 
definition). If a region needs more than one CBC due to geography and very 
large blood volumes, then the number of CBCs should be kept to a 
minimum. 
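The difference columns of Table 5.2 follow directly from its man-year totals, which the Python sketch below recomputes. The man-year figures are taken from Table 5.2; the variable and function names are illustrative.

```python
# Total regional man-years by number of equal-sized centers (from Table 5.2).
MAN_YEARS = {1: 302, 2: 332, 3: 349, 4: 365, 5: 380, 6: 396, 7: 415, 8: 435}


def differences(table):
    """Return (centers, net difference, cumulative difference) for 2 or more centers."""
    rows = []
    base = table[1]
    previous = base
    for centers in sorted(table):
        if centers == 1:
            continue
        rows.append((centers, table[centers] - previous, table[centers] - base))
        previous = table[centers]
    return rows


for centers, net, cumulative in differences(MAN_YEARS):
    print(centers, net, cumulative)
```

Each additional center adds between 15 and 30 man-years in this example, so the penalty for fragmentation accumulates steadily rather than leveling off.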

Table 5.2 Example of the man-years needed in a regional system 
drawing 234,000 red cell units annually using Option 1 

Number of  Annual Volume   Total Regional  Net         Cumulative
Centers    at Each Center  Man-years       Difference  Difference
---------  --------------  --------------  ----------  ----------
1*         234,000**       302
2          117,000         332             30          30
3           78,000         349             17          47
4           58,500         365             16          63
5           46,800         380             15          78
6           39,000         396             16          94
7           33,000         415             19          113
8           29,250         435             20          133

* Option 1 when it has only one center is the same as option 6. 

** In the year of the study, the seven Chicago community blood centers handled 
234,000 units and used 416.5 man-years. However not all seven centers were of 
equal size as the example above has assumed. 

Using the economies of scale results, we can gain an understanding of the 
cost implications of various regional structures as a basis for planned change 
in a region when such change is warranted. We can determine [12]: 

(i) the blood banking functions that should be performed and at what 
locations, 

(ii) which donor areas and transfusion services should be assigned to 
which community blood centers, 

(iii) how many community blood centers should be in a region, 

(iv) where they should be located, and 

(v) how supply and demand should be coordinated. 

The next step in the analysis of the regional supply chain is to use the cost 
analysis to construct a model and decision support system to find: the 
number and location of community blood centers, the allocation of hospital 
blood banks to each CBC, and the routing of delivery vehicles from the 
CBCs to their HBBs to minimize (regular shipping costs + emergency 
shipping costs + operating costs) subject to constraints on: capital 
availability, personnel and facilities, budget, quality assurance, system 
reliability and demands for blood components. We call this model the Blood 
Transportation-Allocation Problem (BTAP) [13]. BTAP is a large 
constrained integer nonlinear program. BTAP is solved by decomposing the 
model into two sub-models: a demand model and a supply model. The 
demand model finds the best locations of the CBCs, allocation of the HBBs 
to the CBCs and the routing of the delivery vehicles for the distribution of 
the blood components to minimize the total costs of routine and emergency 
deliveries plus the system costs of operations subject to constraints. The 
supply model finds the best allocation of the donor supply locations to the 
CBCs to minimize the supply-side transportation and recruiting costs subject 
to constraints. This supply model is a constrained transportation-type 
problem. 

The demand model takes as inputs the locations of HBBs in a region, the 
distances separating them, and their whole blood and component needs. 
Distances can be in any metric but for the analysis, the driving time between 
locations was used. The user then specifies the number of community blood 
centers to be evaluated (from 1 to 10) and their desired locations. In 
addition, the user specifies the desired option from Table 5.1 to be evaluated. 
Locations and options are then varied to achieve the most practical optimal 
locations and allocations. The results of one run of the model are shown in 
Figure 5.9. In this figure, the loops indicate which HBBs are assigned to 
which community blood center to meet the demands for blood at minimal 
cost. 
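One evaluation step of the demand model can be approximated by a deliberately simplified Python sketch: given user-specified CBC locations, assign each HBB to the CBC with the shortest driving time and total the demand-weighted driving time. This is not BTAP itself, which is a constrained nonlinear integer program that also accounts for routing, emergency deliveries, and operating costs. The nearest-center rule, the use of demand-weighted time as a cost proxy, and all names and numbers below are assumptions for illustration.

```python
def allocate(drive_minutes, demand):
    """Assign each HBB to its nearest CBC by driving time.

    drive_minutes[hbb][cbc] is the driving time in minutes;
    demand[hbb] is the HBB's demand in units.
    Returns (assignment, demand-weighted driving time).
    """
    assignment = {}
    weighted_time = 0.0
    for hbb, times in drive_minutes.items():
        cbc = min(times, key=times.get)  # nearest candidate center
        assignment[hbb] = cbc
        weighted_time += demand[hbb] * times[cbc]
    return assignment, weighted_time


# Hypothetical region: three hospitals, two candidate CBC locations.
drive = {
    "H1": {"A": 10, "B": 30},
    "H2": {"A": 25, "B": 15},
    "H3": {"A": 40, "B": 20},
}
units = {"H1": 100, "H2": 50, "H3": 80}

assignment, cost_proxy = allocate(drive, units)
print(assignment)
print(cost_proxy)
```

Repeating this evaluation for different sets of candidate locations mirrors, in a crude way, the process of varying locations and options until a practical configuration is found.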

The supply model was developed to allocate the supplies of blood to each 
CBC. This model takes as inputs the allocation of transfusion services with 
their demands given in the previous model and the available supplies of 
blood in the metropolitan area by zip code areas and then assigns the 
supplies to the CBCs in such a way as to meet the demands in each center 
and minimize costs of collection. Figure 5.10 illustrates the results of the 
supply assignment model corresponding to Figure 5.9. 

Delivery Vehicle Routing. As part of a regional blood bank design model, 
the problem of vehicle routing for blood product deliveries was considered. 
The basic problem involves selecting vehicle routes for each central blood 
bank subsystem that minimize overall transportation costs between the CBC 
and its member HBBs. For each configuration of blood banks, a “sweep” 
algorithm was used. The algorithm is a heuristic method that is incorporated 
into the overall regional blood bank location and central bank allocation 
model (BTAP). Figure 5.11 indicates a typical regional design solution for 
the metropolitan Chicago area that also contains optimal vehicle routes. 

Figure 5.9 Allocation of hospitals to three CBCs based on emergency and routine delivery costs 
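The clustering step of a sweep heuristic for delivery routing can be sketched in Python as follows. This is a generic version of the technique, not the implementation used in BTAP: stops are sorted by polar angle around the CBC and added to the current route until the vehicle capacity would be exceeded. The coordinates, loads, capacity, and function name are hypothetical, and the sketch omits the later steps of a full sweep method (sequencing the stops within each route and trying different starting angles).

```python
import math


def sweep_routes(depot, stops, capacity):
    """Group stops into routes by sweeping around the depot.

    depot is an (x, y) pair; stops maps a name to (x, y, load).
    Returns a list of routes, each a list of stop names in sweep order.
    A stop whose load alone exceeds capacity is placed on a route by itself.
    """
    def angle(name):
        x, y, _ = stops[name]
        return math.atan2(y - depot[1], x - depot[0])

    routes, current, load = [], [], 0
    for name in sorted(stops, key=angle):
        stop_load = stops[name][2]
        if current and load + stop_load > capacity:
            routes.append(current)      # close the route and start a new one
            current, load = [], 0
        current.append(name)
        load += stop_load
    if current:
        routes.append(current)
    return routes


# Hypothetical CBC at the origin serving four HBBs; each vehicle carries 100 units.
hbbs = {
    "H1": (1, 0, 40),
    "H2": (0, 1, 30),
    "H3": (-1, 0, 50),
    "H4": (0, -1, 20),
}
print(sweep_routes((0, 0), hbbs, capacity=100))
```

The appeal of the heuristic in this setting is speed: it can be rerun for every candidate configuration of blood centers without dominating the run time of the overall location-allocation model.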




In the preceding paragraphs it was concluded that some benefits of a 
regionally controlled structure would be: 

• smoothing of the supply of blood from donors and a reduction in 
competition for donors, 

• smoothing of the demands faced by a community blood center for blood 
and components by averaging the demands from many hospital blood 
banks, and 

• economies of scale by operating community blood centers at levels 
above 50,000 units annually. 

These benefits would lead to reduced shortages, outdating and costs. 

Figure 5.10 Optimal donor mobile site allocations for three CBCs 




5.3.2 Tactical and operating decisions in a regional blood banking system 

Before proceeding to detailed discussions of the tactical and operating 
decisions for a CBC and for HBBs, it should be noted that all of the optimal 
decision rules concerning inventory amounts, cross-match release policies, 
issuing policies, trans-shipment policies, vehicle routes and other factors 
interact with one another. That is, if one changes the policy in one area, it 
could affect the policies being followed in the other areas. These interactions 
will become more apparent as we proceed through this section and more will 
be said about them in subsequent discussions. Since it is not possible to 
present all of these policies simultaneously, each will be presented separately 
and the reader should keep in mind that they all interact. Usually for smaller 
changes, the interactions are not significant, i.e. the decision rules are robust. 

Figure 5.11 Daily delivery truck routes 

5.3.3 Forecasting 

As in any business, the driving force for all decisions is the amount and 
types of demands that the business must meet to be successful. This fact is 
no different in blood bank management. Demands drive decisions. Any 
organization with random demand that does not do rational forecasting is 
condemned to work frequently in crisis and higher cost mode. Consequently 
it is necessary to have a good understanding of the demands, past, present 
and future. From our prior discussion, we know demand for the variety of 
blood products carried in inventory is a major source of uncertainty in the 
management of blood banks. Accurate forecasts of the quantity and timing 
of future demands become key inputs to inventory control and donor 
recruiting decision making. In particular, decisions relating to the quantities 
of blood products to be carried in stock, the scheduling of drawings from 
donor lists or mobile drawings, and ordering from other blood banks must all 
be made with such forecasts in mind. 

Demand for blood products can be computed by observing the number of 
those patients in a hospital who may require transfusions on any given day 
(cross-match requests) and the number of units requested for cross-match for 
each patient. Mean or average demand (cross-match quantity) then is simply 
the product of the mean number of requests times the mean number of units 
per request. 

In order to specify the probability distribution for the number of units of a 
specific category or type, Yen [14], building on the work of Elston and 
Pickrel [15, 16], demonstrated that it is sufficient to estimate two parameters, 
the mean number of patients per day requiring transfusion (d_N) and the mean 
number of units requested for each patient (d_R). Moreover, the Neyman A 
distribution characterized by these two parameter values gave an adequate 
representation of the demand distribution obtained from data collected from 
a particular hospital. Subsequent analysis with regard to target blood bank 
inventory decisions [17-19], indicated that it was not necessary to keep track 
of these two components separately since effective system performance can 
be obtained by basing blood inventory decisions on mean demand alone (i.e., 
the product d_N × d_R). Thus, in order to control the blood inventory effectively, 
forecasts of mean daily demand must be generated. 
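A small simulation can illustrate why the product of the two parameters is the mean daily demand. The Python sketch below uses the standard definition of the Neyman Type A distribution: a Poisson number of patients per day, each requesting a Poisson number of units. The parameter values, the seed, and the function names are assumptions for illustration; none are taken from the cited studies.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def daily_demand(rng, d_n, d_r):
    """One day's demand: Poisson(d_n) patients, each requesting Poisson(d_r) units."""
    patients = poisson(rng, d_n)
    return sum(poisson(rng, d_r) for _ in range(patients))


def neyman_a_mean(d_n, d_r):
    """Theoretical mean of the Neyman Type A distribution."""
    return d_n * d_r


def neyman_a_variance(d_n, d_r):
    """Theoretical variance of the Neyman Type A distribution."""
    return d_n * d_r * (1 + d_r)


def simulated_mean(days, d_n, d_r, seed=1):
    """Average simulated daily demand over the given number of days."""
    rng = random.Random(seed)
    return sum(daily_demand(rng, d_n, d_r) for _ in range(days)) / days


# Hypothetical hospital: 5 patients per day on average, 3 units per patient.
print(neyman_a_mean(5, 3), neyman_a_variance(5, 3))
print(simulated_mean(20000, 5, 3))
```

The variance exceeds the mean by the factor (1 + d_R), which is one way to see why daily demand at a single hospital is so much more variable than a simple Poisson model would suggest.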

However, most blood inventory decisions are not reevaluated on a daily 
basis. In particular, target inventory levels probably would be updated on a 
monthly or quarterly basis taking seasonality into consideration. Figure 5.3 
illustrates such target levels, computed on a quarterly basis. 

In order to forecast monthly demand and to identify seasonal cycles, it is 
necessary to collect data for several years. Often in HBBs only aggregate 
data (summed over all blood types) are available for such an extended 
period. For many planning purposes such aggregate forecasts are sufficient. 
In those cases where forecasts specific by blood type are needed, a 
reasonable approach would be to forecast demand levels on the basis of the 
aggregate data and then use estimates of the distribution of demand over 
blood types as a means of disaggregating these estimates into blood type 
specific forecasts. We tested the validity of this approach by examining the 
standard deviation of blood type fractions over one year of observations 
(Table 5.2). These standard deviations were observed to be relatively small 
when compared to the mean for the more common blood types. Moreover, 
the demand fraction for these blood types also was symmetrically distributed 
about its mean value. The rare blood type fractions exhibited significant 
variation relative to their mean and their distribution tended to be skewed. 
Since the rare types do not influence the aggregate blood demand 
significantly, we may conclude that forecasting of aggregate demand and 
subsequent disaggregation is a reasonable approach to generating longer 
term blood type specific forecasts for the common blood types. Further 
analysis, however, is needed for the rare blood types. (Cohen et al. [20] 
show that equation (1) can be used with reasonable accuracy for all blood 
types.) 



Table 5.2 Transfusion requests as fraction of total demand (by blood type) 

Blood Type  Mean Fraction  Standard Deviation
----------  -------------  ------------------
A+          0.3438         0.1104
A-          0.0489         0.0517
O+          0.3804         0.1081
O-          0.0517         0.0447
B+          0.1220         0.0738
B-          0.0132         0.0226
AB+         0.0374         0.0397
AB-         0.0026         0.0076

The best fit for the monthly demand series (12 years of monthly data from an 
HBB) using Box-Jenkins methodology is as follows: 

D(t) = Z(t-1) - 0.812 E(t-1) + 0.259 E(t-12) - 0.211 E(t-13)    (1) 

where 



D(t) = the forecast for month t 

Z(t) = the month t actual transfusion request level 

E(t) = the forecast error (Z(t)-D(t)). 

This forecast equation indicates that the moving average (error) term has 
components at lags of 1, 12 and 13 months. However, even for 
aggregate monthly figures there is significant variance and it is difficult to 
forecast monthly aggregate cross-match request quantities accurately at the 
single hospital blood bank level. We will later see that the optimal inventory 
order-up-to level is not very sensitive to the errors in prediction over a broad 
range around this optimal inventory level. 
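Equation (1) can be applied recursively, as the following Python sketch shows. The coefficients come from equation (1); the start-up rule (forecast errors for months before the first forecast are taken as zero), the function name, and the short example series are assumptions for illustration. The coefficient on E(t-13) is close to the product 0.812 × 0.259 ≈ 0.210, as would be expected from a multiplicative seasonal moving-average model; that reading is an inference, not a statement from the text.

```python
def forecast_series(z, theta1=0.812, theta12=0.259, theta13=0.211):
    """One-step forecasts from equation (1).

    z[0..n-1] are actual monthly request levels Z(0)..Z(n-1).
    Returns [D(1), ..., D(n)]; the last entry is the forecast for the next month.
    Errors for months with no forecast yet are assumed to be zero.
    """
    n = len(z)
    errors = [0.0] * n  # errors[t] holds E(t) = Z(t) - D(t); E(0) is taken as 0
    forecasts = []
    for t in range(1, n + 1):
        def e(lag):
            index = t - lag
            return errors[index] if index >= 0 else 0.0

        d = z[t - 1] - theta1 * e(1) + theta12 * e(12) - theta13 * e(13)
        forecasts.append(d)
        if t < n:
            errors[t] = z[t] - d
    return forecasts


# Hypothetical three months of aggregate transfusion requests.
print(forecast_series([100, 110, 105]))
```

With only a few months of history the seasonal terms contribute nothing; they begin to matter once at least 13 months of forecast errors are available.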

5.3.4 Target inventory levels for an HBB 

As noted previously, the major responsibility of a hospital blood bank is to 
ensure that all blood-related demands are met in a manner that minimizes 
wastage through outdates and spoilage, maintains high quality standards and 
reduces shortages that require either emergency shipments from other blood 
banks, emergency demands on donors, appeals to the hospital staff for 
donations or the delay of nonemergency and elective medical procedures. In 
order to achieve these goals, it is important for the hospitals (HBBs) to set 
inventory levels that trade off shortage versus outdate rates and minimize 
total operating costs. 

This section establishes a simple decision rule for an HBB that yields the 
optimal inventory level for each blood type for whole blood and red cells as 
a function of factors in the blood bank environment (the demand for blood 
by group and Rh, i.e., the blood type, and the ages of the blood units received 
from a CBC) and of management decisions in the hospital itself (the 
inventory levels, the transfusion-to-crossmatch ratio, the crossmatch release 
period and the blood issuing policy, usually FIFO or LIFO). In using this 
simple decision rule, it is not necessary for the hospital blood bank 
administrator to choose a shortage rate for system operation, since the 
inventory level recommended by the rule reflects the optimal tradeoff 
between shortages and outdates [17-19]. 



Demand data from hospitals in the metropolitan Chicago area were used. In 
order to understand the complex interactions among the environmental, 
managerial and random variables affecting the hospital blood bank, a series 
of models were constructed. The analysis began with a simulation model and 
a full factorial statistical design of the key variables to develop the response 
surface for various inventory levels' effects on shortages, outdates and costs. 
Then from the response surface a Cobb-Douglas model was used with log- 
linear regression to determine the optimal inventory order-up-to policy 
(optimal target inventory levels) for any whole blood/red cells Rh-blood 
type. This optimal inventory policy would apply to any hospital blood bank 
whose environmental and managerial data fell within the ranges of those 
variables used in the factorial design. These ranges were chosen to include 
most or all of the hospital blood banks in the United States. Using the 
optimal inventory policy, Cobb-Douglas models with log-linear regression 
were again developed to predict the resulting shortage and outdate levels 
under varying environmental and managerial decisions. These models used 
as input factors the system environment and hospital managerial decision 
variables. The factors considered include: parameters to specify the daily 
demand distribution, the age of units supplied from donors and/or the CBC, 
target inventory levels, the transfusion-to-cross-match ratio, cross-match 
release time, issuing policy, shortage cost, and outdate cost. 

Model outputs include detailed records of all inventory transactions and the 
age distributions of both assigned (cross-matched) and unassigned 
inventories. These outputs are used to estimate the “optimal decision rule,” 
i.e., the relationship between the cost-minimizing target inventory level, S*, 
and the various factors. In a similar way, the outdate rate, O_r, and shortage 
rate, S_r, were determined by relating them to the various factors as well as 
the target inventory rule. 

The decision rule for the target inventory level is summarized in equation 
(2). 

$$S^* = \frac{4.755\,(d_m)^{0.6964}\,(p)^{0.1146}\,(L)^{0.1332}}{(R)^{0.0453}} \qquad (2)$$

where d_m is the mean daily demand for a blood type, p is the average 
transfusion-to-cross-match ratio, L is the maximum shelf life for red cells 
(either 35 or 42 days) and R is the cross-match release time in days. All 
coefficients are significant at the 0.01 level or less and r^2 = 0.99. 

For each blood type, the blood bank manager computes the appropriate 
optimal inventory level, S*, and on a daily basis orders enough blood units 
to bring the available inventory on hand up to S*. If the blood bank receives 
deliveries only on a triweekly, biweekly or weekly basis, then the d_m value 
used in the calculation should be the mean demand over the 
number of days between deliveries. 

A range of 2 to 50 units demanded per day (which corresponds to an annual 
volume of between 300 and 10,000 transfusions) was considered in the 
experimental design. Almost all blood banks have type-specific mean 
demand volumes that fall into these ranges, and hence equation (2) has wide 
applicability. 

The small values of 0.1146 for the power of p, 0.1332 for the power of L and 
0.0453 for the power of R in equation (2) indicate that their influence on S* 
is not nearly as large as that of d_m, with its power of 0.6964. Taken singly 
over the respective ranges of each variable, with the others held constant, the 
effect of p, L or R on S* is, at most, 6 percent to 8 percent. 

For fixed values of p, L and R, the exponent of 0.6964 on mean daily 
demand in the optimal decision rule indicates that as the mean daily demand 
increases, the optimal inventory level increases less than proportionally. 
For example, a blood bank that doubles its activity (in terms of 
mean daily demand) should increase its optimal inventory level by no more 
than 62 percent, since 2^0.6964 is approximately 1.62 (provided that p, L 
and R remain the same). 
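
The decision rule can be evaluated directly. The Python sketch below assumes the form S* = 4.755 (d_m)^0.6964 (p)^0.1146 (L)^0.1332 / (R)^0.0453 for equation (2); the exponents are those quoted in the text, but the placement of R in the denominator is an assumption about the typeset equation, and the function name and example inputs are hypothetical.

```python
def target_inventory(d_m, p, L, R):
    """Order-up-to level S* under the assumed form of equation (2).

    d_m: mean daily demand for one blood type (units per day)
    p:   average transfusion-to-cross-match ratio
    L:   maximum shelf life in days (35 or 42 for red cells)
    R:   cross-match release time in days (must be positive here)
    """
    # Assumption: R enters through the denominator.
    return 4.755 * d_m ** 0.6964 * p ** 0.1146 * L ** 0.1332 / R ** 0.0453


if __name__ == "__main__":
    base = target_inventory(d_m=10, p=0.5, L=35, R=2)
    doubled = target_inventory(d_m=20, p=0.5, L=35, R=2)
    print(f"S* at d_m = 10: {base:.1f} units")
    print(f"S* at d_m = 20: {doubled:.1f} units")
    print(f"ratio after doubling demand: {doubled / base:.3f}")
```

The ratio printed in the last line depends only on the exponent of d_m, so it equals 2^0.6964 whatever values are chosen for p, L and R.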

In a similar manner we can develop equations for the effects of the 
environmental factors and managerial decisions on the outdate rate and the 
shortage rate for a specific blood-Rh type at a hospital blood bank. 

The outdate rate is the ratio of the mean number of units outdated to the 
mean number of units transfused plus units outdated, and the shortage rate is 
the fraction of days on which a shortage occurs. In establishing the 
relationship between the outdate rate and its causal variables, it was evident 
that two additional explanatory causal variables should be the deviation of 
the hospital’s actual mean inventory level, S', from the optimal inventory 
level, S*, and the mean age of delivered units, A, from the CBC. If S' > S*, 
then outdates should increase because more blood is on hand than needed; if 
S' < S*, then outdates should decrease. In each case the reverse holds for the 
effect on shortages when S' differs from S*. 

We also can hypothesize the effect of the other causal variables such as the 
crossmatch release time, R, and the mean age of units, A. As either 
increases, outdates should increase. The reverse should hold for the variables 
d_m, p and L. That is, the larger the mean demand, transfusion-to-cross-match 
ratio or the shelf life, the lower should be the outdates. The regression for O_r 
is given by 



$$O_r = \frac{4.11052\,(R)^{0.66033}\,(A)^{1.57255}\,e^{0.00719\,(S' - S^*)}}{(d_m)^{0.8856}\,(p)^{2.54564}\,(L)^{3.01945}} \qquad (3)$$

where e is the base of the natural logarithm. 

From the regression results in equation (3), we can see that these 
expectations are borne out. All of the coefficients are significant at the 0.01 level 
or less and their algebraic signs agree with the above hypotheses. 
Furthermore, these variables explain 71 percent of the variation in the 
dependent variable (r^2 = 0.71). 

This regression function captures the effects of these six variables on the 
outdate rate. These same causal variables were used to explain the variation 
in the shortage rate, except that instead of the deviation (S' - S*), the reverse 
deviation (S* - S') was used. Consequently, if S' < S*, the shortage rate 
should increase because the actual inventory level S' is below the optimal 
level; and if S' > S*, the shortage rate should decrease. The other variables 
are expected to have the same effect on the shortage rate as they did on the 
outdate rate. As d_m, p, or L increases, the shortage rate should decrease. As 
R or A increases, the shortage rate should increase. 

As shown in equation (4), these expectations have been realized. All of the 
coefficients are significant at the 0.01 level or less and are of the correct 
sign. The log/exponential linear regression explains 59 percent of the 
variation in the dependent variable (r^2 = 0.59). The regression equation is 

$$S_r = \frac{0.09629\,e^{0.17356\,(S^* - S')}\,(A)^{0.57441}\,(R)^{0.05359}}{(d_m)^{0.34867}\,(p)^{0.43568}\,(L)^{1.09577}} \qquad (4)$$

where e is the base of the natural logarithm. 
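
The two regression functions can be evaluated side by side to see how a departure of the actual inventory level S' from S* moves outdates and shortages in opposite directions. The Python sketch below is illustrative only. It assumes that equation (3) has R, A and e^{0.00719(S' - S*)} in the numerator and d_m, p and L in the denominator, and that equation (4) has the same layout with the reverse deviation (S* - S'), as described in the text. The coefficients are read from the typeset equations and should be checked against the original source; the function names and example inputs are hypothetical.

```python
import math


def outdate_rate(d_m, p, L, R, A, S_actual, S_opt):
    """Outdate rate O_r under the assumed form of equation (3)."""
    numerator = (4.11052 * R ** 0.66033 * A ** 1.57255
                 * math.exp(0.00719 * (S_actual - S_opt)))
    denominator = d_m ** 0.8856 * p ** 2.54564 * L ** 3.01945
    return numerator / denominator


def shortage_rate(d_m, p, L, R, A, S_actual, S_opt):
    """Shortage rate S_r under the assumed form of equation (4).

    Uses the reverse deviation (S* - S'), as the text describes.
    """
    numerator = (0.09629 * math.exp(0.17356 * (S_opt - S_actual))
                 * A ** 0.57441 * R ** 0.05359)
    denominator = d_m ** 0.34867 * p ** 0.43568 * L ** 1.09577
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical environment: 10 units/day, p = 0.5, 35-day shelf life,
    # 2-day release period, units arriving at a mean age of 5 days.
    env = dict(d_m=10, p=0.5, L=35, R=2, A=5)
    for s_actual in (28, 34, 40):
        o = outdate_rate(S_actual=s_actual, S_opt=34, **env)
        s = shortage_rate(S_actual=s_actual, S_opt=34, **env)
        print(f"S' = {s_actual}: O_r = {o:.5f}, S_r = {s:.5f}")
```

Holding more than S* raises the computed outdate rate and lowers the computed shortage rate, and holding less does the reverse, which is the tradeoff the target inventory rule is meant to balance.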

The variations in p and R represent examples of internal management 
policies since p and R are affected by the working relationships between the 
blood bank and the ordering physicians. Variations in A and |S' - S*| 
represent external management since the age and amount of arriving blood at 
the hospital often depend upon the policies of a regional blood center. 
The value of L is set by government regulations (either 35 or 42 days for 
red cells) and is outside the scope of managerial decision. 

The amounts of shortages, outdates and costs are determined by a complex 
interaction among these environmental and managerial factors. To capture 
the full benefits of following the optimal inventory policy, 
the other variables must not be allowed to deteriorate, i.e., p should not drop, 
R and A should not increase, and the actual inventory level S' should be held 
close to the target inventory level S* given by equation (2). Some of these 
variables are under the control of the blood bank administrator and others are 
possibly under the control of external administrators such as community 
blood center directors. Significant reductions in shortages and outdates, 
however, can be made by a combination of “good” overall internal and 
external management. 

5.3.5 Optimal blood issuing for crossmatching (FIFO vs. LIFO) 

FIFO (first in first out) and LIFO (last in first out) issuing policies were 
considered in conjunction with varying values of the cross-match release 
period, R, from 0 to seven days. R = 0 corresponds to an inventory system 
where all crossmatched units are transfused or immediately released after the 
procedure and R = 7 corresponds to a system where non-transfused 
crossmatched units remain in the assigned inventory for a period of one 
week. In all, 16 issuing-crossmatch policy combinations were considered 
(eight for FIFO and eight for LIFO). Simulation was again used to 
determine the optimal issuing policy. 

Table 5.3 gives results averaged over a number of runs and Figure 5.12 is a 
graph of cumulated outdates for increasing values of R for both FIFO and 
LIFO issuing policies. The following observations can be made. 

1. When R = 0, as predicted by the theory of Pierskalla and Roach [21], 
FIFO is optimal in terms of minimizing the outdates and shortages and 
costs (since costs increase as the number of shortages and/or outdates 
increase). 

2. As R increases under FIFO issuing, outdates increase and when R = 7 
shortages appear. 

3. As R increases under LIFO issuing, outdates are very large but 
decreasing and shortages increase. 

4. For reasonable R in the range 0-2 days (common in most hospital blood 
banks), FIFO dominates LIFO. 

5. Although not shown, it is possible to generate examples where outdates 
under FIFO exceed those under LIFO for R sufficiently large (greater 
than seven days). 

6. There is great sensitivity to changes in R under both issuing policies. 
The smaller the value of R, the higher is the system performance. 

Choosing between the two policies, FIFO vs. LIFO, for any reasonable 
values of the crossmatch release period, FIFO is optimal and should be used 
by the hospital blood bank manager. Furthermore, the hospital blood bank 
manager must endeavor to keep R as small as possible in order to minimize 
its effect of increasing outdates and shortages. This effect was also seen in 
the outdate and shortage functions, equations (3) and (4) above. 

Table 5.3 FIFO vs. LIFO issuing policy results on outdates and 
shortages for crossmatch release periods (R) ranging from zero to 
seven days 

R                            0       1       2       3       4       5       6       7

FIFO Issuing
Transfused              2272.0  2272.0  2272.0  2272.0  2272.0  2272.0  2272.0  2242.4
Outdated                     0    16.3    58.7   117.0   173.1   202.1   272.1   330.8
Unassigned Inventory     503.0     [?]     [?]     272   161.9   121.9    35.9       0
Assigned Inventory           0    41.0    80.0   114.0   168.8   179.0     [?]   201.8
Total                   2775.0  2775.0  2775.0  2775.0  2775.0  2775.0  2775.0  2775.0
Shortage                     0       0     [?]     [?]       0     [?]     [?]    72.5

LIFO Issuing
Transfused              2272.0  2272.0  2267.8  2266.2  2257.2  2254.0  2241.0  2234.4
Outdated                 477.0   449.3   428.6   403.6   372.2   366.3   357.5   338.8
Unassigned Inventory      26.0    12.7     [?]     [?]     [?]     [?]     [?]     [?]
Assigned Inventory         [?]    41.0    78.6   105.2   145.6   154.7   176.5   201.8
Total                   2775.0  2775.0  2775.0  2775.0  2775.0  2775.0  2775.0  2775.0
Shortage                     0       0    10.9    14.8    47.6    46.3    74.6    90.6
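
The outdate figures reported in Table 5.3 can be checked against observations 2 and 3 with a few lines of Python. The sketch below simply transcribes the "Outdated" rows for the two issuing policies; the variable and helper names are illustrative.

```python
R_VALUES = list(range(8))  # crossmatch release period, 0 to 7 days

# "Outdated" rows of Table 5.3, in units.
FIFO_OUTDATED = [0.0, 16.3, 58.7, 117.0, 173.1, 202.1, 272.1, 330.8]
LIFO_OUTDATED = [477.0, 449.3, 428.6, 403.6, 372.2, 366.3, 357.5, 338.8]


def is_increasing(values):
    """True if every value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def is_decreasing(values):
    """True if every value is strictly smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    for r, fifo, lifo in zip(R_VALUES, FIFO_OUTDATED, LIFO_OUTDATED):
        print(f"R = {r}: FIFO {fifo:6.1f}  LIFO {lifo:6.1f}  gap {lifo - fifo:6.1f}")
    print("FIFO outdates rise with R:", is_increasing(FIFO_OUTDATED))
    print("LIFO outdates fall with R:", is_decreasing(LIFO_OUTDATED))
```

The gap between the two policies narrows steadily as R grows, which is consistent with observation 5 that outdates under FIFO could exceed those under LIFO for release periods beyond seven days.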

Figure 5.12 Outdates for varying crossmatch release periods 

5.3.6 Target inventory levels for a community blood center system or a 
centralized regional blood banking system 

In a community blood center or regional blood center, the management of 
inventories of whole blood and components also involves a complex and 
interrelated set of decisions concerning collection, processing, record 
keeping, storage, issuing and transportation of units. In this section, some 
management decision problems are analyzed to determine easily implemented 
rules that yield the “best,” or at least “very good,” operating results at the 
CBC and its satellite HBBs [10]. 

It has been recognized that benefits can be obtained by pooling resources 
using a community blood center. The most apparent benefit to the hospital is 
that the blood bank staff is relieved of the responsibility of donor 
recruitment, blood procurement and blood processing. This permits the 
hospital blood bank to channel its energies and efforts toward the resolution 
of patient-related transfusion problems. Another advantage to the hospital is 
the opportunity to pool widely fluctuating, largely unpredictable demands 
with those of other hospitals in the system. Within the system the variations 
often cancel each other and produce a smoother, more predictable aggregate 
demand. This will enable member blood banks to maintain lower inventories 
without degrading their outdate and shortage performance. 

Since the demand to which the community blood center must respond is 
generated outside its control, its decision making processes must focus 
primarily on inventory management. While management decisions regarding 
donor recruitment, phlebotomy and processing are essential, they can be 
handled effectively only after efficient optimal inventory control policies 
have been implemented. This control at the community blood center 
requires setting inventory levels to maintain the optimal tradeoff between 
system-wide excess inventory, with consequent outdating, and system-wide 
excess amounts of shortages. 

Inventory levels can be developed for the CBC by using an 
outdating/shortage cost-minimizing procedure similar to that described in the 
previous section for the single hospital blood bank. The optimal inventory 
level at the CBC for each blood type is a function of the number of HBBs 
served by the CBC, their mean daily transfusions, their transfusion to 
crossmatch ratios and their crossmatch release periods. It is assumed that all 
issuing is done using FIFO. Associated with the optimal inventory function 
for the CBC is an optimal inventory level for each member HBB that is a 
function of its demand and transfusion to crossmatch ratio for that location. 
For those hospital blood banks that belong to a centralized system it is to be 
expected that their optimal inventory levels will differ from the values 
indicated by the decision rule of the previous section which is appropriate 
for independent hospitals in a decentralized system. 

In order to study some of the benefits and shortcomings of a centralized 
blood banking system, a simulation model was constructed. The simulation 
model is described in detail in Yen [14]. Among the issues to be discussed in 
this section are the optimal inventory levels at each HBB (denoted S_j for 
each hospital j), the impact on total system cost of high cross-match to 
transfusion ratios, the allocation of units from the CBC to the HBBs, the 
trans-shipment policy among HBBs and the effect of a limited and somewhat 
random supply to the community blood center. In addition, the 
sensitivity of system cost to changes in the number and size of HBBs in the 
system was considered. 

As one might expect, the results indicate that the total of the optimal 
inventory levels in the hospital blood banks increases at a decreasing rate 
as more HBBs are incorporated into the centralized system. Also, after a 
certain system scale is reached the marginal benefits received from lower 
shortages and lower outdates can be expected to approach zero as more 
HBBs are added to the system. Finally, as more HBBs are included in a 
centralized blood banking system, the total average distances between the 
CBC and the HBBs, as well as their information needs, increase and thus the 
corresponding transportation and information costs increase. So, as HBBs 
are added, a saturation number of hospital blood banks in the system is 
reached. Beyond that point, further inclusion of local banks is not likely to 
reduce the system cost per unit and indeed, as has been shown previously, 
may lead to diseconomies of scale above 200,000 units annually. 

5.3.7 Optimal daily inventories at the CBC 

The daily amount of whole blood and components to be maintained centrally 
at the CBC depends upon the amounts maintained at each HBB in the system. 
If the total inventories at the HBBs are larger than would be optimal, 
then the amount at the CBC should be small and vice versa. However, large 
inventories at the HBBs could result in more outdates and/or 
outdate-anticipating trans-shipments. Similarly, small inventories might 
incur more emergency shipments and/or shortage-anticipating 
trans-shipments. Because of these possibilities, there must be a balance between 
the inventory at the CBC and the inventories at the HBBs in the supply 
chain. 

Equations for determining optimal inventories of whole blood and packed 
cells at the HBBs were given in the previous section. Using a similar 
simulation-optimization-regression approach and making the same 
reasonable assumptions concerning the system costs of shortages and 
outdates, the optimal inventory level at the CBC was established. 

The outdate cost consists primarily of the average costs per unit of 
recruiting, processing, storing and transporting one unit. When a unit 
outdates, these costs are basically lost. Actually a more appropriate cost to 
charge for outdates would be the marginal per unit costs of these blood bank 
activities rather than average per unit costs. However, it is not easy to obtain 
actual marginal costs since the cost figures available are not sufficiently 
precise to define the appropriate marginal relationship. Furthermore, since 
the average cost includes many variable items such as bag costs, record 
keeping and hours of work, it is reasonably representative of the marginal 
cost. 

The shortage cost at the CBC was based on the cost for processing and 
handling a unit on an emergency basis and for recruiting and/or 
transportation from another source on an emergency basis. Again, marginal costs 
per unit would be better but they were not available. The shortage cost at the 
HBBs was based on the average per unit cost of maintaining a buffer stock 
of frozen blood units either at the HBB or the CBC or shipping a unit by 
emergency shipment from another regional center. Finally, it should be 
recalled that what is important about these costs is not their absolute levels, 
but rather their relative magnitudes. Hence, if inflation should cause them to 
rise in the same relative proportions, the results still hold. Furthermore, the 
results hold even when the relative magnitudes are varied over reasonable 
ranges. 

Many variables were considered in this inventory supply chain analysis to 
find the optimal target inventory levels at the CBC and its independent 
satellite HBBs. A complete list of these variables is shown below. 
However, and somewhat surprisingly, only three of these many variables are 
needed to make optimal decisions in this centrally controlled supply chain. 
The optimal target inventory level at the CBC depends only on d_0 and N, 
and the optimal target inventory level at satellite HBB j depends only on d_j. 
This contrasts with the optimal target inventory level at the independent HBB, 
which needs three variables as shown in equation (2). 

Variables initially considered were the following: 

d_0 = mean demand for type-specific whole blood and packed red cells 
at the community blood center. 

d_j = mean demand for whole blood (WB) and packed red cells (PRCs) 
at hospital blood bank j. 

It is assumed that demand at the bank is a Neyman type A distributed 
random variable characterized by the mean number of patients per day and 
the mean units requested per patient. Yen [14] demonstrated that the 
Neyman type A distribution fits the data well. Other variables initially 
considered were: 

R = the crossmatch release period, the time lapse before a unit is 
returned to the unassigned inventory if not transfused (in days) 

S_0 = inventory level at the CBC (in units of WB and PRCs) 

S_j = inventory level at location j (in units of WB and PRCs) 

N = number of HBBs in the system 

p_j = probability of a cross-matched unit of WB or PRC being 
transfused at location j 

V_0 = shortage at the CBC (in units of WB and PRC) 

V_j = shortage at location j (in units of WB and PRC) 

O_j = outdate at location j (in units of WB and PRC) 

n = number of times a unit of WB or PRC is cross-matched in its 
lifetime 

a = age of a unit of WB or PRC when it is cross-matched for the first 
time. 

The variables d_0, d_j, S_0, S_j, V_0, V_j and O_j are computed for each group and Rh 
factor; rather than have two subscripts, one for location and the other for 
ABO and Rh, the second subscript has been suppressed for ease of writing 
the results. However, for low-volume rare blood groups when the target 
levels and the demands are small, say, one or two units, it is better to 
maintain more stock at the CBC rather than incur excessive trans-shipping of 
units. 

The optimal target inventory level at the CBC is: 

$$S_0^* = 3.14\,(d_0)^{0.72}\,(N)^{0.93} \qquad (5)$$

We find R^2 = 0.993 and F = 3529. All coefficients are significant at the 
0.001 level. 

The corresponding optimal target inventory level for the HBBs which belong 
to and are fully coordinated by the CBC is: 

$$S_j^* = 7.99\,(d_j)^{0.78} \qquad (6)$$

We find R^2 = 0.995 and F = 8675. The term d_j is significant at the 0.001 level. 

As noted above, all of the other variables that were used in the original 
Cobb-Douglas function were not significant and did not contribute to the 
analysis of variance so they were removed from the regressions and only the 
variables shown in equations (5) and (6) were used in the final analysis. 
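
Equations (5) and (6) are simple enough to evaluate in a few lines. The Python sketch below assumes the forms S_0* = 3.14 (d_0)^0.72 (N)^0.93 and S_j* = 7.99 (d_j)^0.78; the exponents are read from the typeset equations as 0.72, 0.93 and 0.78, which is an assumption to be checked against the original source. The function names and the example system are hypothetical.

```python
def cbc_target(d_0, n_hbbs):
    """Target inventory S_0* at the CBC under the assumed form of equation (5).

    d_0:    mean type-specific demand at the community blood center
    n_hbbs: number of HBBs in the system (N)
    """
    return 3.14 * d_0 ** 0.72 * n_hbbs ** 0.93


def member_hbb_target(d_j):
    """Target inventory S_j* at member HBB j under the assumed form of equation (6)."""
    return 7.99 * d_j ** 0.78


if __name__ == "__main__":
    # Hypothetical system: five member HBBs with these mean demands.
    hbb_demands = [2.0, 5.0, 8.0, 12.0, 20.0]
    for j, d_j in enumerate(hbb_demands, start=1):
        print(f"HBB {j}: d_j = {d_j:5.1f}, S_j* = {member_hbb_target(d_j):6.1f}")
    # Assumption for illustration only: CBC demand taken as the sum of HBB demands.
    d_0 = sum(hbb_demands)
    print(f"CBC: d_0 = {d_0:.1f}, S_0* = {cbc_target(d_0, len(hbb_demands)):.1f}")
```

Because both exponents on demand are below one, the computed targets grow less than proportionally with demand, as in the rule for the independent hospital blood bank.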

The relationship between the level of demand and the optimal inventory 
level in terms of days of blood usage for both an independent bank and a 
member of a central system is illustrated in Figure 5.13. This figure was 
computed from the equations above and from equation (2) for target 
inventory at an independent bank for the case where the transfusion fraction 
at each bank is p = 0.5 and the cross-match release time is R = 2 days. The 
optimal inventory level at a hospital blood bank can be reduced by 20 to 50 
percent for an HBB that has its inventory level managed by a community 
blood center. 

Figure 5.13 Optimal days of inventory to keep on hand to meet 
transfusions for a given blood type 

5.3.8 Centralized blood bank issuing and allocation policies to HBBs 

After the CBC receives all the requests from the HBBs, the orders are filled 
by drawing from the inventory in the CBC using an oldest-to-youngest 
issuing policy. For purposes of simplification as well as good medical 
practice, each group and Rh factor is considered independent of the other 
groups and Rh factors. When the sum of all type-specific HBB demands 
exceeds the total inventory in the CBC, the CBC may backlog the excess 
demand or may fill all demands by calling in donors, by contacting other 
CBCs, by using frozen packed red cells or by requesting an emergency 
shipment from still higher echelon (regional) blood banks. In this analysis 
the CBC uses different approaches to handle the excess demand depending 
upon whether the orders are routine or emergency. Routine orders are placed 
by the HBBs at the beginning of each day to build up their inventory to a 
specific level. Emergency orders are placed during the day when the 
inventory of the HBBs cannot meet their respective users’ demands. For routine 
orders, the CBC will fill the orders as long as its inventory lasts and 
disregard the excess demands, if any. Consequently, the HBBs may not receive 
the full amount they ordered. For emergency orders, the CBC still fills the 
orders as long as its inventory lasts. However, if there are excess emergency 
demands, the CBC will attempt to fill them from the inventory of the HBBs 
within the system. Furthermore, if there is insufficient stock in the whole 
system to fill the excess emergency demands, then the CBC will fill them by 
contacting exogenous sources. The rationale for the different treatments of 
these three types of excess demand, i.e., the three types of “shortages,” 
and for the distinction between routine and emergency orders, is that the 
routine orders are used to build up the buffer inventory in the HBBs. These 
routine orders may not represent actual transfusion demands that day. 
Therefore if the excess of the routine orders over the available inventory at 
the CBC is not filled, a true shortage will not necessarily occur. On the 
other hand, the emergency orders, if not filled, will most likely create a 
shortage, since the buffer inventory in the HBB has to be essentially 
depleted before the HBB will place an emergency order. 
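
The order-filling logic described above can be sketched in Python as follows. This is a simplified illustration for a single blood type: the function names are hypothetical, the spare stock available at other HBBs is supplied as a single number, and the age of units is ignored.

```python
def fill_routine(cbc_stock, order):
    """Fill a routine order from CBC stock; any excess demand is disregarded.

    Returns (units_shipped, cbc_stock_remaining).
    """
    shipped = min(cbc_stock, order)
    return shipped, cbc_stock - shipped


def fill_emergency(cbc_stock, other_hbb_spare, order):
    """Fill an emergency order in three stages.

    1. from the CBC's own inventory,
    2. from stock that other HBBs in the system can spare,
    3. from exogenous sources for whatever remains.
    Returns (from_cbc, from_other_hbbs, from_outside).
    """
    from_cbc = min(cbc_stock, order)
    remaining = order - from_cbc
    from_other_hbbs = min(other_hbb_spare, remaining)
    from_outside = remaining - from_other_hbbs
    return from_cbc, from_other_hbbs, from_outside


if __name__ == "__main__":
    print(fill_routine(cbc_stock=10, order=15))
    print(fill_emergency(cbc_stock=2, other_hbb_spare=3, order=8))
```

In this sketch an unfilled routine order is simply dropped, whereas an emergency order is always met in full, with the last stage drawing on sources outside the system.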

Since each HBB may not receive all that it has ordered, a systematic process 
is needed to allocate the available stock in the CBC to HBBs. This allocation 
process is called the allocation policy. Essentially there are three distinct 
practical alternatives: 

1. The CBC picks an HBB and fills its order by the First Come First Served 
(FCFS) issuing policy, and then goes on to fill the next HBB order until all 
the stock runs out or all orders from HBBs are filled. This type of allocation 
process resembles the practice that exists in some blood banking systems. 

2. The CBC ships an amount to each HBB such that the ratio of the amount 
received to the amount ordered is the same for each HBB. Furthermore, all 
shipments have the same ratio of the amount of different ages received to the 
amount ordered. This type of allocation process resembles proportional 
rationing of scarce resources and is intended to be fair to all users with 
regard to their stated target needs by treating each user equitably. (See 
Cohen, Pierskalla and Yen [22] for a theoretical treatment of this problem.) 

3. The CBC ships each unit to the hospital where the shortage probability is 
the highest in the system. In other words, the delivery of each unit is 
intended to adjust the system stock configuration such that total system 
shortage probabilities may be improved. If the target level needs in policy 2 
above are based on shortage probabilities, then this alternative policy 
coincides with policy 2. However, if the target level needs are based on 
some tradeoff between shortages and outdates, then policies 2 and 3 may 
differ slightly. 
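
The three allocation policies can be contrasted with a short Python sketch. It is a simplification under stated assumptions: the age mix of units is ignored, proportional allocation is not rounded to whole units, and the shortage probability used in policy 3 is supplied by the caller as a hypothetical function `shortage_prob(j, inventory)`, since the text does not give its form here.

```python
def allocate_fcfs(stock, orders):
    """Policy 1: fill orders in the sequence given until stock runs out."""
    shipped = []
    for quantity in orders:
        units = min(quantity, stock)
        shipped.append(units)
        stock -= units
    return shipped


def allocate_proportional(stock, orders):
    """Policy 2: each HBB receives the same fraction of what it ordered."""
    total = sum(orders)
    if total == 0:
        return [0.0 for _ in orders]
    ratio = min(1.0, stock / total)
    return [quantity * ratio for quantity in orders]


def allocate_by_shortage_probability(stock, on_hand, shortage_prob):
    """Policy 3: send each unit to the HBB whose shortage probability is highest.

    on_hand[j] is the current inventory at HBB j; shortage_prob(j, inventory)
    is a caller-supplied (hypothetical) function. Ties go to the lowest index.
    """
    inventory = list(on_hand)
    shipped = [0] * len(inventory)
    for _ in range(stock):
        j = max(range(len(inventory)),
                key=lambda k: shortage_prob(k, inventory[k]))
        inventory[j] += 1
        shipped[j] += 1
    return shipped


if __name__ == "__main__":
    print(allocate_fcfs(10, [6, 6, 6]))
    print(allocate_proportional(9, [6, 6, 6]))
    # Toy shortage probability that falls as inventory rises (illustrative only).
    print(allocate_by_shortage_probability(
        2, [0, 2], lambda j, inv: 1.0 / (1 + inv)))
```

Under FCFS the last HBB in the sequence can receive nothing, while proportional rationing spreads the shortfall evenly across all HBBs.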

After all HBBs receive their orders it may be desirable to trans-ship units 
among them. Basically there are three reasons for, or types of, such trans- 
shipments: an emergency need at an HBB that cannot be met by the CBC; 
the shortage anticipating trans-shipment; and the outdate anticipating trans- 
shipment [14, 23, 24]. If one location anticipates a shortage while another 
location does not, then a trans-shipment from the latter to the former may be 
beneficial to the system in reducing the system shortage cost. Similarly, if 
one location has an excessive amount of old units while another location 
does not, an outdate anticipating trans-shipment can be initiated for the 
benefit of the system. Before a trans-shipment is made, the exact stock 
configurations of the locations, as well as the demand distributions of the 
locations, must be known in order to evaluate the benefit of the trans- 
shipment. When such information is available, the CBC is in the best 
position to direct the trans-shipments in the system. Obviously, for these 
types of actions a sophisticated information processing system is needed. 

In the case where such information is not available, the benefits of trans- 
shipping are uncertain and no trans-shipment should be made directly from 
one HBB to another. However, since each HBB knows its own stock/age 
configuration, it can choose to return excessively old, but still usable, units 
to the CBC. In this way old units are recycled to other hospitals in the 
system. This particular type of outdate anticipating trans-shipment will be 
called the recycle policy. 

Of the three conditions for a trans-shipment, the most important condition is 
when an HBB has an emergency demand and the CBC does not have 
sufficient stock on hand to meet it. In this case a check of the other HBBs 
should be conducted and a trans-shipment made provided the HBB which 
furnished the units will not be placed in a precarious shortage situation, that 
is, provided the probability of shortage at the sending HBB does not become 
too large after depletion of its stock. 

Less important are the shortage anticipating and outdate anticipating 
trans-shipments. For shortage anticipating trans-shipments, a unit is 
trans-shipped to location A from location B if the shortage probability at A 
is greater than that at B and if the difference of the two probabilities is 
greater than a certain number. The number should be large enough so that the 
trans-shipment will be beneficial to the system. It is calculated according to 
the following formula: 

[shortage probability at A - shortage probability at B] 

> transportation cost/shortage cost (7) 

If the transportation cost is estimated to be about 5 percent of the shortage 
cost, then the number used in the determination of whether or not to 
trans-ship a unit is 0.05 (i.e., initiate a trans-shipment if the difference in 
shortage probabilities exceeds 0.05). Note that the shortage cost is assumed to be 
the same for all HBBs and the transportation cost is independent of the 
facilities between which the trans-shipment occurs. This simplification is justified 
because the majority of the transportation costs are often not the direct costs, 
e.g., gas and time consumed in the shipment, but rather the indirect costs related 
to the handling, labeling, accounting and information exchanged between the 
two facilities. All these indirect costs, however, depend upon the size of the 
system. Therefore, the number 0.05 can at best be described as an educated 
guess. 

For outdate anticipating trans-shipments, a unit is trans-shipped from A to B 
if the outdate probability in A is greater than that in B and if the difference 
of the two probabilities is greater than the transportation cost divided by the 
outdate cost: 

[outdate probability at A - outdate probability at B] 

> transportation cost/outdate cost (8) 

Again, 0.05 was used for this ratio (the number 0.05 is based on similar 
calculations and assumptions as those used above). It should be noted that in 
both cases the number 0.05 is somewhat arbitrary since actual costs are not 
known precisely. However, in the range between 0.03 and 0.20 there appears 
to be no significant difference in the number of units trans-shipped. Indeed, 
for this range, virtually no shortage or outdate anticipating trans-shipments 
will occur [14]. 

One reason why there are few shortage-anticipating trans-shipments stems 
from the allocation policy in the CBC. Recall that units are available for 
trans-shipment only after each HBB has received its delivery. But under 
allocation policies 2 or 3, the units in the CBC are issued one by one to the 
location with the highest shortage probability or proportionally to their target 
needs. So at the end of the allocation process each HBB will have an 
essentially identical shortage probability except when there is insufficient 
inventory in the CBC to make them equal or when there is a tie in shortage 
probabilities before the issuance of the last few units. In both of these cases, 
some discrepancies among shortage probabilities will occur, but they are 
rather negligible under relatively wide ranges of target inventory levels at all 
locations. Consequently, the conditions to initiate shortage trans-shipment 
would rarely occur, hence hardly any units are shortage trans-shipped. For 
this reason the shortage trans-shipment policy has virtually no significant 
effect on the shortages in the system. 
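
The equalizing effect of issuing units one by one can be illustrated with a small sketch. It assumes, purely for illustration, that daily demand at each HBB is Poisson distributed, so that the shortage probability is P(demand > inventory); the chapter does not specify the demand model at this point, and the function names and the three demand rates are hypothetical.

```python
import math


def poisson_shortage_prob(mean_demand: float, inventory: int) -> float:
    """P(demand > inventory) under an ASSUMED Poisson demand model."""
    cdf = sum(math.exp(-mean_demand) * mean_demand ** k / math.factorial(k)
              for k in range(inventory + 1))
    return 1.0 - cdf


def allocate_one_by_one(mean_demands: list[float], inventories: list[int],
                        units_available: int) -> list[int]:
    """Issue CBC units one at a time to the HBB with the highest shortage probability."""
    stock = list(inventories)
    for _ in range(units_available):
        probs = [poisson_shortage_prob(m, s) for m, s in zip(mean_demands, stock)]
        neediest = probs.index(max(probs))  # ties go to the lowest index
        stock[neediest] += 1
    return stock


# Hypothetical example: three HBBs with mean daily demands of 4, 8 and 12 units.
demands = [4.0, 8.0, 12.0]
final_stock = allocate_one_by_one(demands, [0, 0, 0], units_available=36)
final_probs = [poisson_shortage_prob(m, s) for m, s in zip(demands, final_stock)]
print(final_stock, [round(p, 3) for p in final_probs])
```

In this toy setting the final shortage probabilities end up close together, differing only through whole-unit rounding, which is consistent with the observation that the condition for a shortage trans-shipment is rarely met after such an allocation.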

The insensitivity of the outdated units to the outdate trans-shipment policy
can be explained as well. A unit will be outdated only after
several passages through the cross-matching process, so the quantity of
expected daily outdates is fairly small simply because the probability of
outdate, given by $(1 - p_j)^n$, is usually a very small number, where n is the
number of times the unit is cross-matched prior to outdating. Hence, there
are very few units which outdate when optimal inventory, issuing, $p_j$ and R
policies are followed, regardless of whether an outdate trans-shipment policy 
is in effect or not. Consequently, the outdate trans-shipment policy can be 
expected to have virtually no significant effect on the outdates in the system. 
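
A short numerical sketch shows how quickly this probability shrinks with the number of cross-matches. It reads the quantity written above as one minus the transfusion probability, raised to the power n, which is an interpretation based on the later discussion of transfusion probabilities; the value 0.5 and the function name are hypothetical.

```python
def outdate_probability(p_transfusion: float, n_crossmatches: int) -> float:
    """Probability that a unit is never transfused in n independent cross-matches."""
    if not 0.0 <= p_transfusion <= 1.0:
        raise ValueError("p_transfusion must lie in [0, 1]")
    return (1.0 - p_transfusion) ** n_crossmatches


# Hypothetical transfusion probability of 0.5 per cross-match.
for n in (1, 3, 5, 8):
    print(n, round(outdate_probability(0.5, n), 4))
```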

It should be mentioned here that the simulation model also indicated that
while some units are trans-shipped, the actual quantities were
insignificant even when the inventory levels at different locations were
varied over wide ranges. However, if the actual inventory levels used are far
larger than the optimal target inventory levels at the HBBs, and if the
allocation policy is changed to the FCFS allocation policy, then, as one
would expect, outdate trans-shipments would become significant. Both of these
decisions are extreme and should not be followed. That is, the CBC should
use optimal target inventory levels and should not use allocation policy 1
(FCFS).



We now summarize the best trans-shipment and allocation policies: 

1. Use allocation policy 2. Allocation policy 3 is also good but requires 
more computation and time for implementation. 

2. Trans-ship units from one HBB to another in any of the following cases:

a) If there is an emergency need at an HBB, the CBC is out of
stock, and the sending HBB does not incur an excessive
probability of shortage (say over 10 percent).

b) If the probability of shortage at HBBi minus the probability of
shortage at HBBj is greater than or equal to the ratio of unit
transportation cost to shortage cost.

c) If the probability of outdate at HBBi minus the probability of
outdate at HBBj is greater than or equal to the ratio of unit
transportation cost to outdate cost.

5.3.9 Optimal cross-matched release and issuing policies from the CBC 

Cohen and Pierskalla [17] show that if a unit is cross-matched at an HBB
and not reported transfused within a short time (R = 1, 2 or 3 days), further
information should be obtained on the status of the demand for which the
unit was issued. If the demand had disappeared, the unit should be made
available for possible reassignment either at the same bank or another
hospital blood bank. In this manner, the cross-match release time, R, should be
kept as low as possible. As long as R can be maintained below 4 days, the 
FIFO issuing policy should be followed at the CBC for those HBBs which 
receive daily or at least tri-weekly deliveries from the CBC. If R exceeds 7 
days, last-in first-out (LIFO) will be somewhat better than FIFO but both 
policies will then have excessive outdates and shortages. 

In another study of issuing policies in an HBB [25], it was shown that for a
department which has low usage and low values of $p_j$, a LIFO issuing policy
for that department should be followed. The underlying reason why LIFO
should be followed rather than FIFO is to increase the probability of 
transfusion of the cross-matched unit. This same reasoning applies to some 
HBBs in a CBC system, namely, those HBBs which require infrequent 
deliveries (weekly) and have low transfusion probabilities. This case often 
occurs at small distant rural HBBs. For these HBBs, the CBC should issue 
by LIFO and then at the next delivery pick up any non-transfused units,
replacing them with younger units. The slightly older units which are then
picked up may be made available to HBBs with higher volume needs which 
have higher transfusion probabilities. 

Optimal policies include: 

1. The cross-match release period, R, should be 1 or 2 days (the smaller 
that R is, the lower are the shortages, outdates, and costs). 

2. For HBBs which receive daily, triweekly or biweekly deliveries, the 
units which are shipped to them should be issued on a FIFO basis 
(unless fresh units are needed for special purposes such as cardiac 
surgery). 

3. For HBBs with infrequent deliveries (once a week), the units which are 
shipped to them should be issued on a LIFO basis and unused units 
from the prior shipment should be picked up and replaced with younger 
units. 
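
The three policies above amount to a small decision rule, sketched below. The function names and return labels are hypothetical, "biweekly" is read here as twice a week, and the label "FRESH" simply marks the stated exception for special purposes without prescribing how fresh units are selected.

```python
def cbc_issuing_policy(deliveries_per_week: int, needs_fresh_units: bool = False) -> str:
    """Return the issuing rule suggested by the policy list above."""
    if deliveries_per_week < 1:
        raise ValueError("an HBB needs at least one delivery per week")
    if needs_fresh_units:
        return "FRESH"  # exception noted in the text (e.g., cardiac surgery)
    if deliveries_per_week >= 2:
        return "FIFO"   # daily, tri-weekly or bi-weekly deliveries
    return "LIFO"       # weekly delivery; unused units are picked up at the next visit


def release_period_ok(release_days: int) -> bool:
    """The recommended cross-match release period R is 1 or 2 days."""
    return release_days in (1, 2)


print(cbc_issuing_policy(5), cbc_issuing_policy(1), release_period_ok(2))
```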

5.4 CONCLUSIONS AND FURTHER RESEARCH 

This chapter has considered a number of contributions to the development of 
operational procedures for blood bank management. In regionalization, it 
was shown that economies of scale exist in most of the blood bank 
management functions. Consequently a centralized community blood center 
is more efficient than a decentralized system. In addition, algorithms were 
developed to provide optimal allocation of HBBs and donor sites to CBCs in 
the case in which a region has multiple CBCs. Optimal target inventory 
levels, allocation, trans-shipment and issuing policies were shown for CBCs 
with central and with coordinated controls. Time series methods were 
applied to daily type-specific cross-match and monthly total cross-match 
data. These methods led to models for forecasting mean daily demands that 
are required for inventory control. A simulation model and statistical
analysis were used to develop a target inventory decision function for
inventory levels at an independent hospital blood bank, at HBBs that are a
part of a centralized system and at the CBC. The mean daily demand, the 
transfusion to cross-match ratio and the cross-match release period were 
shown to be significant variables. Many of the key decisions in blood system 
management were analyzed and developed. However, there are still many 
open research questions that should be addressed for a more complete 
understanding of this supply chain. 

In 1984, Prastacos [1] noted some unresolved research issues in his survey 
paper. They are still unresolved today. He noted the need for research on: 







Optimal component processing policies. Because the demand for
components has risen greatly due to new medical technologies and therapies, 
upwards of 95% of whole blood units drawn are processed into various 
components. Furthermore many components are being collected by 
pheresis. There are differing quality and cost aspects to these two methods 
that need analysis and modeling. In addition to the practical needs of the 
blood banks in this area of component processing, there is a major need for 
more research in inventory theory for developing and analyzing 
mathematical models in which a common input source is subdivided into 
value added components. Deuermeyer [26, 27] developed optimal inventory 
policies for a product model that also produced a valuable by-product. But 
very little theory has been developed since his work. 

Distribution scheduling of multiple products from the Center to the hospitals.
With the increase in use of components and their differing shelf lives, the
immediacy of delivery for some of them (in order to maximize their useful
lives), combined with the less demanding delivery of relatively long shelf-life
red cells, poses new logistics problems for the CBC.

Organizational structures for regional systems. Although much work has
been done (as noted above), there are major problems of centralization/
decentralization involving contractual relations between the CBC and the
HBBs. These problems involve, but are not restricted to, who owns the
blood products and at what points in time, what the agency relationships are
and how they can be priced to maximize the overall societal benefits
vis-à-vis the individual parties’ benefits, and what the game relationships
among the parties are and whether there are equilibria. Here again there is need for theory
to illuminate the issues and practice to achieve the most desirable results for
donors and patients.

Pricing of blood products and inter-regional cooperation. To some extent
there is a war out there. Many of the suppliers are in heavy, mostly negative 
competition among themselves and with many of the HBBs. 

Donor scheduling algorithms. Frequently it is the case in a region that
mobile and in-house drawings and pheresis drawings are seasonally bunched 
or else have seasonal gaps. In either case, the supply is not smoothed to 
meet demand and there is either excess outdating or shortages. Because of 
the very stochastic nature of both the supply and the demand processes, 
adaptive stochastic modeling is needed to improve the system. 

There are many more research areas that could be mentioned, but the above
areas give a flavor of the still large knowledge needs for optimal blood
products supply chain management. Some studies [28, 29] have
estimated that blood products are very costly because of their significant
utilization in many procedures, and that this use accounts for about 1% of total
hospital costs in the United States; hence small improvements can yield significant
national savings.

SUMMARY

Elective surgery typically generates 40 percent or more of a hospital’s total 
revenue, and individual surgeons almost always have a net positive 
contribution margin. Perioperative services include surgical operations,
pre-operative care of patients, and post-operative care. This chapter presents a
method to identify best practices among hospitals’ perioperative services 
using Data Envelopment Analysis (DEA). This analysis included 44,033 
procedures performed by 3,502 surgeons at 53 non-metropolitan 
Pennsylvania hospitals. Eight procedures, each performed by one surgical 
specialty, were selected. For each hospital, DEA 1) identifies untapped 
markets for surgery; 2) identifies relatively high and low procedure volumes 
among specialties; and 3) suggests a strategy for increasing surgical volume 
for inefficient hospitals. Findings may be used by managers of perioperative 
services to aid in resource allocation decisions, such as hiring and 
recruitment among different surgical specialties. 

KEY WORDS 

Data envelopment analysis, Perioperative services

6.1 INTRODUCTION

Elective surgery typically generates 40 percent or more of a hospital’s 
revenue, and individual surgeons almost always have a positive contribution 
margin [1, 2]. Hospitals depend on their surgeons for a steady flow of 
patients and revenue. Yet the management of the surgical functions within a 
hospital is a very complex and demanding task. This chapter presents a 
method to identify best practices among hospitals’ perioperative services 
using Data Envelopment Analysis (DEA). 

Elective surgery differs from non-elective surgery, such as trauma and 
transplant surgery, in that elective procedures are scheduled in advance. 
Perioperative care begins once the decision is made that a patient will 
undergo surgery at a hospital. We define perioperative services (POS) as 
the sub-system of a hospital that produces elective surgery, pre-operative 
care, and post-operative care. As such, POS is a complex system that uses 
multiple inputs, such as capital and personnel, to produce multiple products, 
such as procedures by specialty (Figure 6.1). POS can be thought of as a
“hospital-within-a-hospital” that encompasses all the functions associated with
elective surgery.

Much of POS is isolated from the rest of the hospital, not just practically but 
physically. Personnel cannot enter operating rooms without wearing surgical 
scrubs and masks. Operating room (OR) nursing has little overlap with other 
types of nursing, and requires a year of additional training. Nurse 
anesthetists and anesthesiologists have little non-perioperative work. 
Surgical equipment and anesthesia machines are used under few other 
circumstances. 

The Director of POS is typically a nursing or medical director; a
medical director is usually an anesthesiologist [3]. In allocating scarce
resources, such as OR time, equipment, and staff, the director must weigh
the demands of different surgical specialties. Operating room nurses in
hospitals usually focus on three or fewer surgical specialties, as do many
anesthesiologists. Deciding whether the anesthesiology department’s new
member has subspecialty training in cardiac surgery or regional anesthesia
(i.e., for orthopedics) balances one surgical specialty against another.
Which specialties are favored can significantly impact surgeons’ flexibility
and access to POS.

The strategic factors that determine a hospital’s potential workload for
elective surgery have been well established. Erickson and Finkler [4]
showed that hospital market share is driven by its visibility in the community
and the number of physicians with privileges at the hospital. They also
showed that hospital visibility is driven in turn by the number of beds,
number of services provided, and teaching status. Adams et al. [5] found that
patients were willing to travel further for teaching hospitals with more acute
care beds and more sophisticated services. All other things being equal,
patients have strong preferences for local hospitals [6]. Hence the numbers of
potential patients within a hospital’s county and region are important
predictors of the hospital’s workload [5, 7].

Figure 6.1 Model of production for perioperative services

The director of POS has little control over the strategic factors that 
determine a hospital’s potential workload, but he or she does have 
significant control over operational factors that determine the proportion of 
the potential workload that is actually done at the hospital. This is 
particularly true in competitive markets where surgeons may have multiple 
hospital affiliations. The perioperative system typically has three 
bottlenecks: access to convenient operating room time, availability of 
specialized surgical equipment, and (for some procedures) open and staffed
intensive care unit (ICU) beds. For operating room scheduling, the “jobs”
are not patients but surgeons, since patients are more flexible in their
availability. Surgical suites vary dramatically in their flexibility for booking
cases. If the waiting time is too long, then the patient will likely receive care
elsewhere, either with the same surgeon or a competing surgeon. On a
long-term basis, surgeons who cannot get convenient OR time at one hospital tend
to gravitate to other hospitals that can better meet their needs. Other 
operational factors that influence where surgeons choose to practice include 
availability of specialty-trained nurses and equipment, staff turnover times 
between consecutive cases, and availability of ICU beds. 

6.1.1 Evaluating the performance of perioperative services

A number of practical difficulties arise in evaluating the performance of 
POS across institutions. Hospitals differ significantly in factors that 
influence the demand for elective surgery, such as the number of staffed 
beds, technological services offered, and size of the market. Therefore, a 
good evaluation method should compare a hospital’s POS with peer entities 
that operate in a similar environment and use a similar combination of 
resources to produce a similar product mix. The method should 
accommodate system complexity in the form of multiple outputs and 
multiple inputs. The method should capture the tradeoffs faced by managers 
in allocating resources to different specialties, as well as the potential for 
substitution among the inputs [8]. Finally the measure should be clinically 
meaningful and relevant to physicians and OR managers. 

Previous analyses of the performance of POS have mostly used ratio 
methods. Among the ratios that have been used are the following: delay in 
on-time start per case [9]; contribution margin per case [1]; labor cost per 
case [10]; patient waiting time per case [11]; anesthesia drug costs per case 
[12]; and anesthesia relative work units per case [13]. These ratios provide 
one-dimensional measures of how well POS is doing at one task or specialty. 
There is no clear way to collapse these multiple ratios into a single 
performance measure. Moreover, the ratios themselves are based on the 
workload performed at one hospital, rather than comparisons among 
hospitals. They do not measure or predict the facility’s expected 
perioperative workload compared with the best practices at peer institutions. 

DEA offers several advantages over previous ratio methods [8, 14]. First, 
DEA combines multiple ratios into a single ratio of productive efficiency. 
Second, DEA allows for resource substitution among the inputs as well as 
managerial tradeoffs among the outputs. Third, DEA compares each hospital 
to its peers and identifies benchmark facilities for inefficient hospitals.

This chapter extends the use of DEA in health care to perioperative services.
The results of this model can be used by directors of POS to aid in resource 
allocation decisions, such as hiring and recruitment among different surgical 
specialties and capital equipment purchasing. For inefficient hospitals, the 
results can suggest how to increase surgical volumes. The remainder of this 
chapter is organized as follows: In the next section, we review DEA and the 
specific formulations that were chosen for this study. This is followed by a 
description of our data and methods (Section 6.3), results of our analysis 
(Section 6.4), model validation (Section 6.5), and conclusions (Section 6.6). 

6.2 DATA ENVELOPMENT ANALYSIS 



Data envelopment analysis (DEA) is a linear-programming-based technique
to measure the technical efficiency of Decision-Making Units (DMUs). DEA
works by estimating a piece-wise linear envelopment surface, known as the
best-practice frontier. DEA is a deterministic, non-parametric technique, and
thus makes no assumptions about the underlying form of the production
function or the distribution of error terms. This technique accommodates
multiple inputs and multiple outputs without prior knowledge of their
relative prices.

DEA has been applied extensively in health care and has been shown to 
offer several advantages over other techniques, such as multivariate 
regression [15], ratio analysis [8], and other econometric approaches [16]. 
For a review of DEA health care studies, see Ozcan [17] and Hollingsworth 
et al. [16]. Areas of application include hospitals [15, 18-20], physicians [8, 
14], nursing homes [21], and health maintenance organizations [22]. This 
chapter extends the use of DEA in health care to perioperative services. 

To estimate the efficiency of surgical hospitals, the CCR (Charnes, Cooper,
and Rhodes) input-oriented model was used [23, 24]. The CCR model can be
formulated as follows: Suppose that there are n DMUs, each of which uses m
inputs to produce s outputs. Let $X_{ij}$ (i = 1, ..., m) be the amount of input i used
by DMU j; let $Y_{rj}$ (r = 1, ..., s) be the amount of output r produced by DMU j
(j = 1, ..., n). The technical efficiency of DMU 0 is then given by

\[
\max h_0 = \frac{\sum_{r=1}^{s} u_r Y_{r0}}{\sum_{i=1}^{m} v_i X_{i0}} \tag{1}
\]

subject to

\[
\frac{\sum_{r=1}^{s} u_r Y_{rj}}{\sum_{i=1}^{m} v_i X_{ij}} \le 1, \quad j = 1, \ldots, n, \tag{2}
\]

\[
u_r \ge 0, \quad v_i \ge 0, \quad \forall\, r, i.
\]

Equation (1) represents the ratio of DMU 0’s virtual output to its virtual
input. Each DMU is free to choose the weights, $u_r$ and $v_i$, that maximize its
efficiency score, with only one set of constraints (equation 2). Efficient
DMUs are those for which it is possible to find a set of weights for which the
efficiency ratio is equal to one. Otherwise, the DMU’s efficiency score will
be less than one and it will be regarded as inefficient. The constant
returns-to-scale CCR formulation was used because previous studies of physician
efficiency have not found variable returns-to-scale [8, 25]. There is some
evidence of increasing returns-to-scale for hospitals owing to horizontal
integration [26].
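
Because the ratio in equation (1) is nonlinear in the weights, the model is usually solved as a linear program. One standard route, given here as background rather than as a description of the computations in this study, is to normalize the virtual input of DMU 0 to one (the Charnes-Cooper transformation). With u and v denoting the output and input weights of equations (1) and (2), the equivalent linear program is:

```latex
\begin{aligned}
\max_{u,\,v}\quad & \sum_{r=1}^{s} u_r Y_{r0} \\
\text{subject to}\quad & \sum_{i=1}^{m} v_i X_{i0} = 1, \\
& \sum_{r=1}^{s} u_r Y_{rj} - \sum_{i=1}^{m} v_i X_{ij} \le 0, \qquad j = 1, \ldots, n, \\
& u_r \ge 0, \quad v_i \ge 0 \qquad \forall\, r, i.
\end{aligned}
```

The first constraint fixes the denominator of equation (1), so the objective equals the efficiency ratio; the second is equation (2) multiplied through by its (positive) denominator. One such program is solved for each DMU in turn.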

In order to derive additional information about the hospitals we studied, we 
incorporated extensions to basic DEA, including super-efficiency, known as
the AP (Andersen and Petersen) model [27], and multifactor efficiency
(MFE) [28]. The AP model is identical to the CCR model, except that the
self-referential constraint in equation (2) is relaxed, allowing the efficiency 
score to exceed one [27]. The AP model has been used to identify potential 
data errors and to rank efficient DMUs [27, 29]. One drawback to the latter 
approach is that super-efficiency scores tend to be higher for maverick 
DMUs, i.e. those DMUs that place all their emphasis on one output and one 
input in equation (1) [28]. Multifactor efficiency overcomes this weakness 
by using the slack values from the AP model to rate each DMU with respect 
to all output-input combinations. 
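
The difference between the CCR and AP scores is easiest to see in the special case of a single input and a single output, where no linear program is needed: the constraints in equation (2) cap the weight ratio at the best output-input ratio in the reference set, so a DMU's score is its own ratio divided by that best ratio. The sketch below covers only this special case; the function name and the hospital figures are hypothetical, and the multi-input, multi-output model used in this chapter requires a linear-programming solver.

```python
from typing import List, Sequence, Tuple


def ccr_and_ap_scores(inputs: Sequence[float],
                      outputs: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (CCR score, AP super-efficiency score) for each DMU.

    Valid ONLY for one input and one output under constant returns to scale.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    scores = []
    for k, ratio in enumerate(ratios):
        # CCR: DMU k belongs to its own reference set, so the score cannot exceed 1.
        best_including_self = max(ratios)
        # AP: DMU k is removed from the reference set, so the score may exceed 1.
        best_excluding_self = max(r for j, r in enumerate(ratios) if j != k)
        scores.append((ratio / best_including_self, ratio / best_excluding_self))
    return scores


# Hypothetical data: staffed beds (input) and procedures (output) for four hospitals.
beds = [200.0, 300.0, 250.0, 400.0]
procedures = [400.0, 450.0, 250.0, 600.0]
scores = ccr_and_ap_scores(beds, procedures)
for ccr, ap in scores:
    print(f"CCR = {ccr:.2f}, AP = {ap:.2f}")
```

Only the efficient DMU changes score when it is dropped from its own reference set; for inefficient DMUs the two scores coincide.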

A robustness index, $R_i$, was calculated to measure the sensitivity of the AP
scores with respect to changes in the input and output weights:

\[
R_i = \frac{\mathrm{MFE}_i}{\mathrm{AP}_i}, \qquad 0 < R_i \le 1 \tag{3}
\]

When $R_i$ is close to 1, the AP score is relatively insensitive to changes in the
input and output weights. A small value of $R_i$ indicates a specialist
orientation.



6.3 DATA AND METHODS 

Patient data on inpatient admissions during 1998 from all non-Federal 
Pennsylvania hospitals were obtained from the Pennsylvania Health Care 
Cost Containment Council. Hospital variables were derived from the 1998 
Annual Survey of the American Hospital Association. The study sample 
consisted of the 53 Pennsylvania hospitals that have at least 200 staffed beds 
and are located in non-metropolitan areas. A non-metropolitan area was
defined as a county with a population of less than one million people, based
on the 1990 census.

6.3.1 Defining inputs and outputs 

Eight surgical procedures were selected to measure surgical output (Table 
6.1). These eight procedures were chosen to represent a wide spectrum of 
elective, scheduled, inpatient surgical procedures performed at a hospital. 
Each procedure serves as a proxy for the total surgical caseload within its 
respective specialty. Specifically, each procedure is performed by only one
specialty. For example, we did not include carotid endarterectomy, which is
performed by both vascular surgeons and neurological surgeons. Each
of the eight procedures is correlated with the total inpatient workload for its
respective specialty. Also, the procedures studied were those that are
performed once per hospitalization. Thus, the number of hospitalizations is 
proportional to resource use. For example, hip replacement was included but 
not knee replacement, since some patients undergo one knee replacement 
during hospitalization (one such procedure) whereas others undergo bilateral 
knee replacement (two such procedures). Hospital discharges were selected 
based on the six ICD-9-CM procedure codes listed in the hospital discharge 
abstract. 

We used the Diagnosis-Related Groups (DRG) Case-Mix index as a measure 
of the relative resource use of each procedure. The weights were determined 
by the modal DRG weight for hospital discharges including each of the 
procedures (Table 6.1). Coronary Artery Bypass Graft (CABG) had the 
highest DRG case-mix weight (5.65); hysterectomy had the lowest (0.77). 

The eight procedures studied accounted for 7.5 percent of all inpatient 
discharges in the State of Pennsylvania. Hospitals also produce other outputs 
that were not included in this analysis, including outpatient care, medical 
and pediatric inpatient care, non-elective surgical care such as trauma and 
transplant surgery, post-graduate medical training, and research. However, 
this study focuses on the outputs of elective, scheduled perioperative care. 

Hospital size and capacity were measured by the number of staffed beds
(“BEDS”) and the use of technological services (“TECH”). Hospital
technology was measured as the number of high-technology services
offered, including the following: cardiac catheterization, cardiac surgery,
shock-wave urological lithotripsy, megavoltage radiation therapy, magnetic
resonance imaging, organ transplantation, neonatal intensive care, cardiac
intensive care, and certified trauma care. A constant (c = 1) was added to the
technology variable in order to prevent unbounded solutions in the AP
model due to zeroes in the input data [28].

Table 6.1 Description of input and output variables

Variable        Description                                     Specialty

Outputs
AAA             Abdominal Aortic Aneurysm resection             Vascular
CABG            Coronary Artery Bypass Graft                    Cardiac
COLO            Colorectal Resection - excision of colon        General
                and/or rectum
CRAN            Craniotomy
HIP             Hip Replacement
HYS             Hysterectomy - removal of the uterus
LUNG
NEPH            Nephrectomy - removal of a kidney

Inputs
BEDS            Number of staffed beds at the hospital
TECH            Number of high-tech services offered at the
                hospital
SURGEONS        Number of surgeons who did at least one of
                any of the above eight procedures
COUNTY          Number of above eight procedures done on
                residents of hospital's county, weighted by
                case-mix
CONTIG. COUNTY  Number of above eight procedures performed
                on residents of hospital's region, weighted
                by case-mix

Explanatory
AFFIL           Average number of hospital affiliations per
                surgeon
HOSP-SURG       Hospital's market share among its surgeons
RURAL           Hospital located in a rural county

The input “SURGEONS” was defined as the number of surgeons who 
generated at least one hospital discharge for any of the above procedures. 
Most previous studies of hospital efficiency have excluded the number of 
physicians because they are independent contractors who may admit patients 
to multiple hospitals [15]. For our purposes, it is important to include 
surgeons as an input, since they largely determine both the volume and the 
type of procedures that the hospital can perform. 

The demand for surgery depends on the number of surgeons, population 
size, and population demographics, such as age and gender. County demand 
(“COUNTY”) was measured as the total number of the aforementioned 
procedures performed on residents of each hospital’s county, weighted by 
DRG case-mix index. Demand from contiguous counties (“CONTIGUOUS 
COUNTY”) was defined as the number of procedures performed on 
residents of all those counties sharing a common border with the hospital’s 
county. 
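
As an illustration of how such a demand measure can be computed, the sketch below weights procedure counts by their DRG case-mix weights. The two weights are the ones quoted above for CABG and hysterectomy; the procedure counts and the function name are hypothetical.

```python
from typing import Dict


def case_mix_weighted_demand(counts: Dict[str, int], weights: Dict[str, float]) -> float:
    """Sum of procedure counts, each multiplied by its DRG case-mix weight."""
    missing = set(counts) - set(weights)
    if missing:
        raise KeyError(f"no case-mix weight for: {sorted(missing)}")
    return sum(n * weights[proc] for proc, n in counts.items())


# Weights quoted in the text; counts for one county are hypothetical.
drg_weights = {"CABG": 5.65, "HYS": 0.77}
county_counts = {"CABG": 100, "HYS": 200}
demand = case_mix_weighted_demand(county_counts, drg_weights)
print(round(demand, 2))
```

The same function applied to the pooled counts of all contiguous counties would give the regional measure.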

6.3.2 Explanatory variables 

Surgeons typically have privileges at multiple hospitals. As the number of 
hospital affiliations per surgeon increases, the surgeon is available less often 
at each hospital [4]. Scheduling access to OR time becomes more 
challenging, resulting in idle capacity in the form of unused OR time and 
empty beds. Therefore, we would expect the efficiency of POS to decrease at 
facilities where the surgeons operate at many other hospitals. To test this 
hypothesis, two measures of hospital-surgeon relations were used: mean
number of hospital affiliations per surgeon (“AFFIL”), and the hospital’s
market share among its surgeons (“HOSP-SURG”). The hospital’s market
share among its surgeons was defined as the sum of the eight procedures 
performed at the hospital divided by the sum of the eight procedures 
performed by all surgeons with privileges at that hospital. For example, 
suppose 10 surgeons performed 100 procedures at Hospital A and 100 
procedures at all other hospitals. Then Hospital A’s market share among its 
surgeons would be 50 percent. 
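
The definition can be written as a one-line calculation; the sketch below reproduces the example just given. The function and argument names are hypothetical.

```python
def hospital_share_among_surgeons(procedures_at_hospital: int,
                                  procedures_elsewhere: int) -> float:
    """Share of the surgeons' total caseload that is performed at this hospital."""
    total = procedures_at_hospital + procedures_elsewhere
    if total == 0:
        raise ValueError("the surgeons performed no procedures")
    return procedures_at_hospital / total


# 100 procedures at Hospital A and 100 at all other hospitals.
share = hospital_share_among_surgeons(100, 100)
print(share)  # 0.5, i.e. the 50 percent in the example
```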

Another explanatory variable denoted whether the hospital was located in a
rural county (“RURAL”), as defined by the Office of Management and
Budget (www.nal.usda.gov). DEA assumes that hospitals are peer
decision-making units. If our strategy was successful, our results should not be
significantly different for rural hospitals.

In order to investigate the factors associated with technical efficiency, a
series of parametric (t-Tests) and non-parametric (Mann-Whitney) tests were
performed. The log transform of “SURGEONS” and “BEDS” was used for 
the t-Tests. The chi-squared test of independence was used for the 
dichotomous variable “RURAL.” These tests were done as part of 
validation, in order to determine whether the efficiency scores were 
correlated with our input or control variables. 
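
For readers unfamiliar with the Mann-Whitney test, the sketch below computes its U statistic directly from the definition. It assumes that the comparison is between efficient and inefficient hospitals on one variable, which is our reading of the validation step; the data values are hypothetical, and obtaining a p-value would additionally require the null distribution of U, which is omitted here.

```python
from typing import Sequence


def mann_whitney_u(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """U statistic for group_a: pairs (a, b) with a > b, counting ties as one half."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical affiliations per surgeon at efficient and inefficient hospitals.
efficient = [1.2, 1.5, 1.8]
inefficient = [1.6, 2.0, 2.4, 2.1]
u_eff = mann_whitney_u(efficient, inefficient)
u_ineff = mann_whitney_u(inefficient, efficient)
print(u_eff, u_ineff)
```

The two statistics always sum to the number of pairs, so a value of U near zero (or near the number of pairs) indicates that one group's values lie almost entirely below the other's.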

6.4 RESULTS 

The characteristics of the input and output variables are presented in Table 
6.2. Three of the eight procedures - colorectal resection, hip replacement, 
and hysterectomy - were performed by every hospital. The average number 
of surgeons per hospital was 66. The average number of hospital affiliations 
per surgeon was 1.64. 

DEA identified 24 hospitals as efficient and 29 as inefficient (Table 6.3). 
The average efficiency score was 0.91, based on the CCR model. The AP 
model identified Hospital 38 as the most influential observation, with a
super-efficiency score of 7.67. The second highest AP score was 2.59.
Hospital 38 is examined in more detail below. 

The MFE and $R_i$ measures indicate the robustness of the AP score with
respect to all output-input combinations [28]. The mean MFE score was
0.63, compared with a mean AP score of 1.22. Only six surgical hospitals
had $\mathrm{MFE}_i > 1$. Hospital 48 had the lowest robustness index, $R_{48} = 0.23$,
identifying the hospital as a maverick.

For inefficient hospitals, DEA provided information on the sources of 
inefficiency, as shown by the slack values in Table 6.4. Hospital 3 produced 
130 fewer CABGs and 184 fewer hysterectomies, compared with the
best-practice frontier. Overall, the surgical volumes for Hospital 3 could be
increased by a factor of 1/0.99, or about one percent, while holding all inputs constant.

Hospital 10 had a positive slack for the number of surgeons (13) as well as 
county market (1,037). This finding indicates relatively low productivity at 
the hospital among its surgeons. The surgeons may face barriers in getting 
surgery done at the hospital. Hospital 10 also had excess surgical demand in 
its county, indicating that this facility was losing market share to other 
hospitals. By contrast, the “county market” slack value is zero for surgical 
Hospital 50, indicating it has a large share of the local market but not the 
regional market, since its slack for CONTIGUOUS COUNTY is positive. 

Table 6.2 Descriptive statistics for input and output variables 

                                 Standard
Variable                Mean    deviation    Minimum    Maximum
Outputs
  AAA                     25           20          0         85
  CABG                   229          290          0      1,113
  COLO                   138           81         11        381
  CRAN                    34           48          0        261
  HIP                    131           77         26        367
  HYS                    226          173         27        893
  LUNG                    28           24          0        110
  NEPH                    20           20          0        119
Inputs
  BEDS                   312          112        203        659
  TECH                     6            2          1         10
  SURGEONS                66           39          9        214
  COUNTY               6,295        3,924        406     14,200
  CONTIG. COUNTY      39,821       23,986      3,484     83,841
Explanatory
  AFFIL                 1.64         0.46       1.03       3.20
  HOSP-SURG             0.72         0.21       0.19       1.00
  RURAL                 0.19         0.39          0          1

6.5 MODEL VALIDATION AND INTERPRETATION 

In order to test the validity of our model, we now focus on three hospitals in 
more detail and compare our results with other available evidence. 

Hospital 38 is a tertiary facility in an integrated health system and health 
maintenance organization with more than two million enrollees. It is a 
regional referral center for central and northeast Pennsylvania. The hospital 
is located in a small, rural county with a population of 18,000. Hospital 
volumes were high for complex, resource-intensive procedures, such as 
craniotomy and CABG. The slack for county demand for this hospital was 
zero, indicating that it was drawing many of its surgery patients from outside 
its own county. The craniotomy volumes were in the 96th percentile and the 
CABG volumes were in the 88th percentile. The hospital’s market share 
among its surgeons was 97 percent, the 5th highest in the sample. This was 
found to be the most influential observation, based on its superefficiency 
score. 

Table 6.3 Efficiency scores for 53 hospitals* 

DMU      CCR       AP      MFE      R_t
  1    1.000    1.204    0.610    0.507
  2    1.000    1.599    0.978    0.612
  3    0.992    0.992    0.771    0.777
  4    0.912    0.912    0.628    0.688
  5    1.000    1.339    1.113    0.832
  6    1.000    1.215    0.948    0.781
  7    1.000    1.012    0.557    0.550
  8    0.660    0.660    0.458    0.695
  9    0.976    0.976    0.587    0.602
 10    0.858    0.858    0.652    0.760
 11    1.000    1.478    1.052    0.712
 12    1.000    1.137    0.724    0.637
 13    0.706    0.706    0.335    0.475
 14    1.000    2.586    1.005    0.389
 15    1.000    1.291    0.703    0.545
 16    1.000    1.230    0.413    0.336
 17    0.919    0.919    0.554    0.603
 18    1.000    1.252    0.946    0.756
 19    0.728    0.728    0.384    0.528
 20    0.974    0.974    0.385    0.395
 21    0.929    0.929    0.302    0.325
 22    1.000    1.232    0.768    0.623
 23    0.818    0.818    0.411    0.502
 24    0.710    0.710    0.235    0.330
 25    0.752    0.752    0.453    0.603
 26    0.899    0.899    0.672    0.747
 27    0.963    0.963    0.378    0.393
 28    0.993    0.993    0.477    0.481
 29    0.933    0.933    0.456    0.489
 30    0.934    0.934    0.480    0.514
 31    1.000    2.129    1.115    0.524
 32    0.538    0.538    0.313    0.582
 33    1.000    1.220    0.710    0.582
 34    0.836    0.836    0.715    0.855
 35    1.000    1.122    0.664    0.592
 36    0.842    0.842    0.353    0.419
 37    1.000    1.067    0.386    0.362
 38    1.000    7.669    2.255    0.294
 39    1.000    1.214    0.639    0.526
 40    1.000    1.325    0.802    0.605
 41    0.561    0.561    0.336    0.599
 42    0.863    0.863    0.270    0.313
 43    0.736    0.736    0.495    0.672
 44    1.000    1.031    0.407    0.394
 45    1.000    1.674    1.179    0.704
 46    0.609    0.609    0.306    0.503
 47    0.987    0.987    0.451    0.457
 48    1.000    1.830    0.423    0.231
 49    0.652    0.652    0.482    0.739
 50    0.833    0.833    0.371    0.445
 51    1.000    1.138    0.712    0.626
 52    0.885    0.885    0.656    0.742
 53    1.000    2.386    1.138    0.477

* CCR = Charnes, Cooper, and Rhodes; AP = Andersen and Petersen; MFE = 
Multifactor Efficiency 

Hospital 48 was found to be an efficient, “maverick” hospital (Table 6.3). 
This facility had some of the fewest surgeons (9), beds (204), and 
technological services (0) of all 53 hospitals. It is located in a small market, 
as measured by county (740) and regional (7,906) demand. Despite its 
difficult operating environment, this facility competed successfully by 
focusing on three procedures: colorectal resection, hip replacement, and 
hysterectomy. These procedures have a relatively low case-mix weight and 
require comparatively low investment in technology. The hospital was 
efficient to a large extent because its market share among surgeons was 94 
percent, which was in the 88th percentile. This means that 94 percent of the 
cases performed by these nine surgeons were performed at Hospital 48. 

Table 6.4 Slack analysis for inefficient hospitals 

Table 6.4a Increased outputs 

       Efficiency
Hosp.       Score   AAA  CABG  COLO  CRAN   HIP   HYS  LUNG  NEPH
    3         99%     5   130    46     0     0   184    28     4
    4         91%     0     0     0    16     6    81    17    11
    8         66%     0     0   [?]    10     0     9     8     4
    9         98%     1     0    21    49     0     0     5    16
   10         86%     0     0     0     0    44   129     7     9
   13         71%    13    94     0    16     0     0    11     8
   17         92%     3   132     3     7     0    24     4     0
   19         73%     1     0     0     9   [?]    73    19    12
   20         97%     8    78     0    25    21     0    14     0
   21         93%     4     8     0     0     8     0     4     1
   23         82%     0   274     0    11    13   190     8     7
   24         71%     3     4     0     1     0     0     0     2
   25         75%     3   103   [?]     2    30   176     0     7
   26         90%   [?]    94     0    44    19    58     1     0
   27         96%     2    52     0     0    16    21    20    12
   28         99%    10   249     0    29    34     0     0     5
   29         93%    11   324     0    16     0    88     0    11
   30         93%     0    62     0    12    33    95     1     9
   32         54%     0   255     0     9    28    51     0     4
   34         84%     0    54     0     0    64   103     0     4
   36         84%    13    42     0    25    44    43     5     0
   41         56%     9   156     0     5    16    66     0     9
   42         86%    21   177     0    40    54     0    19    12
   43         74%     5   158     0     5     0     5    15     2
   46         61%    14   155    38    18    32     0     4     0
   47         99%    14   142     0    33    11     0    12     0
   49         65%     2   163    19    13     5     0     8     0
   50         83%     8   159     0    29    12    54    19     0
   52         88%     2     0    28    31    49     0   [?]     2

[?] = value not legible in the source.

Table 6.4b Decreased inputs 

          Efficiency              Tech.                County    Regional
Hospital       Score    Beds   Services   Surgeons     Market      Market
       3         99%       0          0          1      3,053           0
       4         91%       0          2          0         76       9,762
       8         66%       0          1          0      1,097      15,957
       9         98%       0          1          0      1,679      22,801
      10         86%       0          0         13      1,037           0
      13         71%       0          0          0        651           0
      17         92%      72          0          0        677           0
      19         73%      48          1          0      4,739      17,990
      20         97%      82          1          0        250      34,818
      21         93%     102          0          0        133      16,871
      23         82%       0          0          0      1,935           0
      24         71%     183          0          0        371       7,476
      25         75%       0          0          0      4,102      33,470
      26         90%       0          0         17      1,960           0
      27         96%      27          2          0     10,481      44,521
      28         99%       0          1          0      4,075      48,602
      29         93%       0          0          0      6,988      44,337
      30         93%      35          0          0      1,951           0
      32         54%       0          0          0      3,536      16,294
      34         84%       0          0          0      1,834           0
      36         84%     147          0          0      4,755      11,780
      41         56%       0          0          0          0       6,048
      42         86%      15          0          0        498           0
      43         74%       0          0          0         61           0
      46         61%       0          2          0      2,528      10,847
      47         99%      92          4          0        178           0
      49         65%       0          1          0        958           0
      50         83%      27          0          0          0      11,417
      52         88%       0          0         15      1,911      17,523

Hospital 10 is located in a competitive market with four other hospitals 
within its county. Table 6.4b shows that this facility has positive slack for 
both surgeons and county demand. This facility’s surgeons have 2.8 hospital 
affiliations on average, the second highest in the sample. Its market share 
among its surgeons was 51 percent, which was in the 19th percentile. These 
findings corroborate the DEA results. If all the surgeons with privileges 
could be persuaded to admit all their patients to this hospital, then surgical 
volume would almost double. 
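
The “almost double” figure follows from simple arithmetic, assuming the surgeons’ total caseload stays fixed and only the share brought to Hospital 10 changes. With a current share of 0.51, raising the share to 1 multiplies the hospital’s surgical volume by

\[
\frac{1}{0.51} \approx 1.96 .
\]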

The results of the post-hoc analysis for efficient and inefficient hospitals are 
shown in Table 6.5. As expected, efficient hospitals had fewer affiliations 
per surgeon than inefficient hospitals (1.45 vs. 1.80; P < 0.01). The 
distribution of affiliations per surgeon for efficient and inefficient hospitals 
is shown in Figure 6.2. Only two of the 24 efficient hospitals had an average 
of more than two affiliations per surgeon. By contrast, eight of the 29 
inefficient hospitals averaged more than two affiliations per surgeon. 
Efficient hospitals had a higher market share among their surgeons (80 
± 18 percent vs. 65 ± 22 percent). This difference was statistically 
significant for the Mann-Whitney test (P = 0.009). Thus, a hospital’s market 
share among its surgeons was positively associated with its overall POS 
efficiency. 

There were no statistically significant differences between efficient and 
inefficient hospitals with respect to the other variables tested, including beds, 
surgeons, county, contiguous county, and rural location. Thus, there was 
little evidence of increasing returns-to-scale, as hospital POS efficiency was 
not associated with the size of the hospital or market. 
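
The chi-squared result for rural location can be reproduced approximately from the proportions in Table 6.5. The Python sketch below assumes that 5 of the 24 efficient hospitals and 5 of the 29 inefficient hospitals are rural; these counts are inferred from the reported proportions (0.208 and 0.172), not given in the text. It uses the uncorrected Pearson statistic for a 2 x 2 table.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def chi_squared_p_value_df1(statistic):
    """Upper-tail p-value of a chi-squared variable with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Assumed counts inferred from the proportions in Table 6.5.
rural_eff, urban_eff = 5, 19      # 5 / 24 = 0.208
rural_ineff, urban_ineff = 5, 24  # 5 / 29 = 0.172

stat = chi_squared_2x2(rural_eff, urban_eff, rural_ineff, urban_ineff)
p_value = chi_squared_p_value_df1(stat)
print(f"chi-squared = {stat:.3f}, p = {p_value:.2f}")
```

Under these assumed counts the statistic rounds to the 0.111 shown in the table, and the p-value is well above 0.2.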

6.6 DISCUSSION AND CONCLUSIONS 

DEA has been applied extensively to other areas of health care, but this is 
the first study to apply DEA to hospitals’ perioperative services. This study 
has demonstrated the usefulness of DEA in capturing the complexity and 
managerial tradeoffs that characterize perioperative services. In addition to 
measures of hospital capacity, our analysis included market factors that are 
significant predictors of surgical demand. 

This study found that the strength of a hospital’s relations with its surgeons 
is an important predictor of POS efficiency (P < 0.01). Hospitals having 
stronger relationships with their surgeons were more likely to be efficient. 
This finding is not surprising, since surgeons largely determine surgical 
volumes. In rural areas, larger hospitals may have a captive market for 
surgeons, since surgeons’ choices may be limited by geography and other 
factors. In more competitive markets, hospitals may have to work harder to 
satisfy their surgeons and ensure a steady stream of patients. 

Table 6.5 Differences between efficient and inefficient hospitals 

                                     Mean                  t-Test             Mann-Whitney U
Variable                    Efficient  Inefficient   Statistic  p-value    Statistic  p-value
Affiliations per Surgeon         1.45         1.80        2.96    0.005          195    0.006
Hospital's Market Share
  Among Surgeons                 0.80         0.65         [?]      [?]          201    0.009
Beds                              346          284         [?]    0.062        -1.35    0.177
Surgeons                           76           58       -1.28    0.205        -1.39    0.166
County                          6,080        6,473        0.36    > 0.2        -0.58    > 0.2
Contiguous County              36,318       42,719         [?]      [?]        -1.03    > 0.2
Rural                           0.208        0.172                             0.111*   > 0.2*

* Based on the chi-squared test of independence 
[?] = value not legible in the source. 

Figure 6.2 Histogram of average affiliations per surgeon 
for efficient and inefficient hospitals 

Hospital managers and executives can use these results in several ways. For 
inefficient hospitals, DEA suggests ways to increase hospital volume. A 
positive slack for the number of surgeons indicates that the hospital has more 
surgeons than would be expected given its current surgical volume. This is 
an indication that the hospital needs to improve its market share among its 
surgeons. This could be accomplished by reducing scheduling delays in the 
OR, offering more amenities, or reducing turnover times between cases. 
Positive slack in some but not all procedures provides insight into which 
surgical specialties to focus on in capital equipment purchasing, recruiting 
sub-specialty trained anesthesiologists, and in training OR nurses. These are 
all operational factors that are under the control of the director of POS. 

Future research should compare the DEA results with parametric, 
regression-based approaches in order to identify the comparative advantages 
of each method. Future research should also adapt this model to metropolitan 
areas where competition among hospitals for surgeons and patients is more 
intense. 

SUMMARY 

This chapter uses windows and cone ratio analysis - a longitudinal and 
weight-restriction application of Data Envelopment Analysis (DEA) - to 
develop a methodology for analyzing organizational performance of 
community mental health centers (CMHCs); the chapter also develops 
measures of efficiency as a basis for improving productivity in behavioral 
health care. Specifically, non-hospital services provided by CMHCs were 
studied. Data limitations are noted in relation to use of the method and to 
the results. The model is shown to capture the impact of managed care on 
CMHC efficiency. The cone ratio version of the model, using weight 
restrictions, identified six super-efficient CMHCs, which had been 
consistently efficient since the implementation of managed behavioral care. 
The potential usefulness of this method for public and private mental health 
systems and for managed care companies is discussed. 

KEY WORDS 

Windows analysis, Efficiency, DEA, Weight restricted model, Mental health 

7.1 INTRODUCTION 

The 1990s brought many challenges to behavioral health care organizations 
to improve the efficiency of their health care delivery. In response, 
behavioral health care organizations implemented different forms of 
managed care. Of particular importance has been the dramatic increase in 
managed care programs under Medicaid. Managed care contracts use 
pricing mechanisms to influence the use of services by controlling the 
amounts paid to health care providers and professionals. Effective cost 
control should of course be accompanied by a thorough understanding of the 
varying services provided by different mental health care providers, as well 
as by the use of good practice protocols for treating mental health 
conditions. 

This chapter reports on a pilot investigation that uses data envelopment 
analysis (DEA) to develop methods for studying the technical efficiency of 
providers of community mental health care, in order to improve productivity. 
It focuses solely on the care of the seriously mentally ill (SMI) patients who 
receive services reimbursed by Medicaid. We examined 12 community 
mental health centers (CMHCs), all receiving traditional fee-for-service 
Medicaid reimbursement in years 1-2 (1994 and 1995). In years 3-5 (1996- 
1998), a mandatory, capitated Medicaid managed care program was 
implemented in the geographic areas served by eight of those CMHCs. 

The measures of community mental health efficiency that we developed are 
tested by comparing the longitudinal patterns of provider efficiency over a 
five-year time frame, before and after implementation of the mandatory 
Medicaid managed care plan. We compare efficiency scores between the 
managed care site - the Tidewater area - and the control site - Richmond, 
Virginia. We also develop a structured method for identifying the effects of 
data limitations and the effects of ongoing modifications in managed care 
plans on the interpretation of findings. 

The methods we developed offer a replicable, objective methodology that 
can be used to compare the operational efficiency of different types of 
providers who care for similar populations of clients. The methodology 
identifies consistent measures for comparison - numbers of patients treated - 
and provides a means of aggregating information on different numbers of 
patients to serve as a measure of organizational performance. 

This methodology could be useful to public mental health systems as well as 
to private and public managed care companies, because it can identify the 
combinations of services that result in the most efficient care. That 
information can be used to change the mix of services that a managed care 
company will reimburse, and/or those that a provider chooses to use. 

7.2 BACKGROUND 

7.2.1 Relevance 

The cost of health care in the United States was $943 billion in 1996, with 
over 10% of that money ($99 billion) spent on behavioral health care. 
Mental health disorders consumed 7% of the health care costs, with 
Alzheimer’s disease/dementias and addiction disorders consuming 2% and 
1% of total costs, respectively [1]. Eighteen percent of the expenditures on 
mental health went to multi-service mental health clinics, which include 
community mental health centers [1]. From 1986 to 1996, mental health 
costs rose 1% less than overall health costs did. One explanation for the 
lower rise is more use of cost-containment strategies by managed care 
companies, which resulted in increased efficiency and lower expenditures on 
mental health care [1]. Other possible reasons are Medicaid program design, 
reductions in inappropriate hospitalizations, use of non-mental health 
services, and the shift of mentally ill persons from inpatient care to the 
community [1]. 

Twelve percent of the United States population is covered under Medicaid 
for their health care. Medicaid’s costs for behavioral health amount to 19% 
of its expenditures; per capita annual Medicaid mental health expenditure is 
approximately $481 [1]. These costs justify examination of efficiency in the 
provision of the mental health services by community mental health 
organizations. 

It has been suggested that, to be effective, health insurance plans should 
provide the following services: 24 hour care/hospitalization, intensive 
community services, outpatient services, medication management, 
psychosocial rehabilitation, and outreach services (Frank et al. in [1]). It is 
less clear, however, which combinations of services are optimal for the 
treatment of specific population sub-groups. 

Effective service delivery must result in desirable outcomes for patients. At 
the same time, budget constraints dictate that these outcomes must be 
achieved in a financially responsible manner: CMHCs must provide 
effective services with efficiency. The efficiency of mental health care 
providers is understudied and will be the focus of this chapter. We 
demonstrate the use of DEA as a methodology to answer the following 
questions: 1) How can different mental health services be used together for 
optimal efficiency? 2) How important are specific services in the overall 
efficiency and use of mental health services? 3) How do community mental 
health organizations compare in overall efficiency? These questions are 
explored within the context of the implementation of a Medicaid managed 
care program. 

Efficiency has been evaluated in other states that have implemented 
Medicaid managed care programs. Results have been inconsistent. In 
Massachusetts the implementation of Medicaid managed care led to a 25% 
reduction in costly inpatient care, but an increase in rehospitalizations. In 
Utah there was no effect on use of inpatient care, but some differences in 
outpatient care. Utah enrollees with the worst mental health had the least 
improvement when Medicaid managed care was implemented. An 
evaluation of the Colorado program noted that utilization management did 
not focus as much on outpatient care as on inpatient care and that “utilization 
management strategies to provide outpatient services more efficiently are 
insufficient” [1-4]. 

7.2.2 Data envelopment analysis 

DEA evaluates organizational performance by considering multiple inputs 
and outputs to identify the most efficient providers. DEA has been 
successfully applied in many industries [5], including the study of health 
care organizations and professionals. Sherman [6] and Nunamaker [7] were 
among the first to apply DEA measures to hospitals, examining hospitals in 
Massachusetts and Wisconsin, respectively. Huang and McLaughlin [8] 
applied DEA to programs for rural primary health care; Sexton and 
colleagues [9] applied DEA to the Veterans Administration Medical Centers. 
Applications of DEA in health proliferated in the 1990s, including studies of 
physicians [10-12], mental health programs [13, 14], nursing homes [15], 
aging agencies [16], and hospitals [17-19]. Collectively, these studies 
demonstrate that DEA is an effective research tool for evaluating the 
efficiency of health care providers, given varying input mixes and types and 
numbers of outputs. 

DEA uses linear programming to search for optimal combinations of inputs 
and outputs, based on the actual performances of decision making units, in 
this case, CMHCs. In this chapter, we use DEA to evaluate the technical 
efficiency of each CMHC relative to “optimal” patterns of production. 
Patterns are computed using the performance of CMHCs whose inputs and 
outputs are not optimized by those of any other comparison or peer CMHC. 
DEA computes the relative efficiencies with which CMHCs combine major 
categories of inputs to generate general categories of outputs typically 
produced by providers. Controllable and uncontrollable inputs/outputs are 
taken into consideration, as is the size of each CMHC. 

DEA also calculates inefficiency values for each CMHC. The inefficiencies 
are the degrees of deviance from the frontier. Input inefficiencies show the 
degree to which inputs must be reduced for the inefficient CMHC to lie on 
the efficient practice frontier. Output inefficiencies are the needed increase 
in outputs for the CMHC to become efficient. If a particular CMHC either 
reduces its inputs by the inefficiency values or increases its outputs by the 
amount of inefficiency, it could become efficient; that is, it could obtain an 
efficiency score of one. 
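
The ideas of relative efficiency and input inefficiency can be illustrated with a deliberately simplified case. With one input, one output, and constant returns to scale, the DEA linear program reduces to comparing each unit’s output-per-input ratio with the best ratio observed. The Python sketch below uses that special case with hypothetical numbers; the model in this chapter is a multi-input VRS model and requires a linear programming solver, so this is an illustration and not the authors’ computation.

```python
def ratio_efficiency(units):
    """Map each unit name to its CRS efficiency for one input and one output."""
    productivity = {name: out / inp for name, (inp, out) in units.items()}
    best = max(productivity.values())
    return {name: p / best for name, p in productivity.items()}


def input_inefficiency(units, scores):
    """Input reduction that would move each unit onto the efficient frontier."""
    return {name: (1.0 - scores[name]) * units[name][0] for name in units}


# Hypothetical (service claims used, patients treated) pairs, not study data.
cmhcs = {"A": (100, 50), "B": (200, 120), "C": (150, 60)}
scores = ratio_efficiency(cmhcs)
excess = input_inefficiency(cmhcs, scores)
for name in cmhcs:
    print(f"{name}: efficiency {scores[name]:.2f}, excess input {excess[name]:.1f}")
```

Here unit B defines the frontier. Unit C would need to treat the same 60 patients with 50 fewer claims to reach an efficiency score of one.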

Various types of DEA models can be used, depending upon the problem at 
hand. The DEA model we use can be distinguished by the scale and 
orientation of the model. If returns to scale cannot be assumed to remain 
constant as the size of the service facility increases, then a 
variable-returns-to-scale (VRS) DEA model, the one selected here, is an 
appropriate choice (as opposed to a constant-returns-to-scale (CRS) model). 
Furthermore, if managers’ priority in seeking better efficiency is 
to adjust their inputs (before outputs), then an input-oriented DEA model 
rather than an output-oriented model is appropriate. 

The way in which the DEA program computes efficiency scores can be 
explained briefly using mathematical notation (adapted from [20]). 

The VRS envelopment formulation is expressed as follows: 

\[
\begin{aligned}
\mathrm{VRS}_p(Y_l, X_l, u^l, v^l):\quad \min\ & -(u^l s + v^l e) \\
\text{subject to}\quad & Y\lambda - s = Y_l, \\
& -X\lambda - e = -X_l, \\
& \mathbf{1}\lambda = 1, \\
& \lambda \ge 0,\ e \ge 0,\ s \ge 0.
\end{aligned}
\]

For decision making unit $l$, $x_{il} > 0$ denotes the $i$th input value, and $y_{rl} \ge 0$ 
denotes the $r$th output value. $X_l$ and $Y_l$ denote, respectively, the vectors of 
input and output values. Units that lie on (determine) the surface are deemed 
efficient in DEA terminology. Units that do not lie on the surface are termed 
inefficient. Optimal values of variables for decision making unit $l$ are 
denoted by the $s$-vector $s^l$, the $m$-vector $e^l$, and the $n$-vector $\lambda^l$. 
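
To show how the constraints fit together, the Python sketch below evaluates the formulation for one candidate weight vector; it does not solve the linear program. Given a lambda that is non-negative and sums to one, the output slacks are s = Y lambda - Y_l and the input slacks are e = X_l - X lambda, and both must be non-negative. The data, the unit weights u = v = 1, and the function names are assumptions made for illustration.

```python
def additive_slacks(X, Y, unit, lam, tol=1e-9):
    """Evaluate the VRS envelopment constraints for one candidate lambda.

    X[i][j] is input i of unit j; Y[r][j] is output r of unit j.
    Returns (s, e): output slacks s = Y lam - Y_l, input slacks e = X_l - X lam.
    Raises ValueError if the candidate violates any constraint.
    """
    if any(w < -tol for w in lam) or abs(sum(lam) - 1.0) > tol:
        raise ValueError("lambda must be non-negative and sum to one")
    s = [sum(y * w for y, w in zip(row, lam)) - row[unit] for row in Y]
    e = [row[unit] - sum(x * w for x, w in zip(row, lam)) for row in X]
    if any(value < -tol for value in s + e):
        raise ValueError("candidate lambda gives a negative slack")
    return s, e


def objective(s, e, u, v):
    """Objective value -(u's + v'e); more negative means more total slack."""
    return -(sum(ui * si for ui, si in zip(u, s))
             + sum(vi * ei for vi, ei in zip(v, e)))


# Hypothetical data: one input, one output, three units (columns 0, 1, 2).
X = [[2.0, 4.0, 5.0]]
Y = [[1.0, 3.0, 2.0]]

# Compare unit 2 with unit 1 alone, i.e. lambda = (0, 1, 0).
s, e = additive_slacks(X, Y, unit=2, lam=[0.0, 1.0, 0.0])
value = objective(s, e, u=[1.0], v=[1.0])
print(s, e, value)
```

A solver would search over all feasible lambda for the most negative objective. A unit whose best attainable objective is zero has no slack and lies on the surface.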

Although DEA is a powerful optimization technique that can assess the 
performance of each CMHC, it has certain limitations. When one has to 
deal with large numbers of inputs and outputs in service production, and a 
small number of organizations are under evaluation, the discriminatory 
power of the DEA is limited. However, analysts can overcome this 
limitation by including only those factors (input and output) that provide the 
essential components of service production, thus avoiding distortion of the 
DEA results. This is usually done by eliminating one of a pair of factors that 
are strongly positively correlated with each other. 
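
A minimal Python sketch of this screening step follows. It keeps factors in the order given and drops any factor whose Pearson correlation with an already-kept factor exceeds a threshold. The 0.9 cut-off, the factor names, and the data are hypothetical; the text does not state which threshold was applied.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(factors, threshold=0.9):
    """Keep factors in order, skipping any that correlate above the
    threshold with a factor that has already been kept."""
    kept = []
    for name, values in factors.items():
        if all(pearson(values, factors[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical claim counts for five providers (illustrative only).
factors = {
    "therapy": [10, 20, 30, 40, 50],
    "medication": [12, 19, 33, 41, 48],  # nearly collinear with therapy
    "clubhouse": [5, 1, 4, 2, 3],
}
kept = drop_correlated(factors)
print(kept)
```

Which member of a correlated pair to retain is a substantive judgment; this sketch simply keeps whichever factor is listed first.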

In the majority of studies using DEA, the data are analyzed cross- 
sectionally, with each decision making unit (DMU) - in this case the CMHC 
- being observed only once. Nevertheless, data on DMUs are often 
available over multiple time periods. In such cases, it is possible to perform 
DEA over time, where each DMU in each time period is treated as if it were 
a distinct DMU. This DEA technique is called window analysis [21]. Using 
window analysis, one can examine changes in efficiency over time. A 
DMU’s performance in an initial time period is compared to its performance 
in later time periods, and compared as well to the performance of the other 
DMUs. We employed window analysis to assess the changes in CMHC 
efficiency over time [22]. 
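
The bookkeeping behind window analysis can be sketched in a few lines of Python. Each (unit, period) pair becomes a separate DMU, and the pairs that fall in the same window are evaluated against one another. The window width is a parameter chosen by the analyst; the function and variable names below are illustrative, and the unit labels are placeholders.

```python
def build_windows(unit_ids, periods, width):
    """Return a list of windows; each window is a list of (unit, period)
    pairs that are evaluated together as separate DMUs."""
    if not 1 <= width <= len(periods):
        raise ValueError("width must be between 1 and the number of periods")
    windows = []
    for start in range(len(periods) - width + 1):
        span = periods[start:start + width]
        windows.append([(u, p) for u in unit_ids for p in span])
    return windows


cmhcs = [f"CMHC{i}" for i in range(1, 13)]  # 12 placeholder unit labels
years = [1994, 1995, 1996, 1997, 1998]

single = build_windows(cmhcs, years, width=5)   # one window covering all years
rolling = build_windows(cmhcs, years, width=3)  # three overlapping windows
print(len(single), len(single[0]), len(rolling), len(rolling[0]))
```

With 12 units and five years, a single five-year window yields 60 DMUs, while a three-year width yields three windows of 36 DMUs each.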

7.3 METHODS 

7.3.1 Data and data sources 

The primary source of data was the Department of Medical Assistance 
Services (DMAS) of Virginia, which makes its extensive claims files 
available for research purposes. Medicaid data come in three files: 
claims, recipients, and providers. The claims data set records the dates 
of service, diagnosis, procedure, and patient profile for each claim. The 
recipient data set contains eligibility information on recipients of 
Medicaid. The provider data set contains the provider’s location, 
practice type, and specialties. Data for five consecutive 
years of fee-for-service care (calendar years of 1994 through 1998), 
including two years of managed care encounters (1997 and 1998), were used. 

Data were provided by the Virginia Medicaid agency with the cautions that 
managed care data have not been evaluated for reliability and validity and 
that there are known data quality concerns. All variables from the encounter 
data set are considered to have quality limitations, which we will point out 
and which should be considered in interpreting preliminary findings. 

7.3.2 Sample selection and analysis plan 

Only patients with Serious Mental Illness (SMI) were included in the study. 
Patients were identified as SMI patients using ICD-9-CM diagnosis codes in 
the range of 295.00 - 298.99 (schizophrenia, major affective psychosis, 
paranoid states, and other non-organic psychoses). The claims of SMI 
recipients were merged and aggregated to the unit level of the community 
mental health center (CMHC), also known as a Community Services Board 
(CSB) in Virginia. To ensure experience and consistency in services for SMI 
patients, we examined 12 CMHCs (the providers) that had 100 or 
more claims for SMI patients in case management services in 1994. Those 
12 CMHCs also were examined in the next four consecutive years (1995, 
1996, 1997, and 1998). A total of 60 CMHC-year observations (12 CMHCs x 5 
years) were thus included in the analysis and constituted the units of analysis. 
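
The selection rule can be sketched in Python as follows. The record layout (the field names cmhc, year, service, and dx), the provider labels, and the claim records are hypothetical; the actual DMAS file structure is not described here.

```python
from collections import Counter


def is_smi(icd9_code):
    """True if an ICD-9-CM code falls in the range 295.00-298.99."""
    try:
        return 295.00 <= float(icd9_code) <= 298.99
    except ValueError:  # non-numeric codes such as V or E codes
        return False


def select_cmhcs(claims, base_year=1994, service="case management", minimum=100):
    """Return CMHCs with at least `minimum` SMI claims for `service` in `base_year`."""
    counts = Counter(
        c["cmhc"] for c in claims
        if c["year"] == base_year and c["service"] == service and is_smi(c["dx"])
    )
    return sorted(name for name, n in counts.items() if n >= minimum)


# Hypothetical claim records; field names and values are illustrative only.
claims = (
    [{"cmhc": "CSB-A", "year": 1994, "service": "case management", "dx": "295.30"}] * 120
    + [{"cmhc": "CSB-B", "year": 1994, "service": "case management", "dx": "296.20"}] * 40
    + [{"cmhc": "CSB-B", "year": 1994, "service": "case management", "dx": "311"}] * 90
)
selected = select_cmhcs(claims)
print(selected)
```

In this toy data only CSB-A passes, because CSB-B has 40 qualifying SMI claims once the non-SMI diagnosis is excluded.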

7.3.3 Variables 

We included outcome and resource measures for community mental health 
centers derived from the DMAS database. These measures comprise two 
output and six input variables. Output variables are: the number of Medicaid 
SMI patients in the Medicaid eligibility category of supplemental security 
income (SSI), and the number of patients in all Medicaid eligibility 
categories except SSI. This categorization is a proxy for “more severe” 
and “less severe” cases, and hence captures case-mix differences in the 
outputs. Inputs that we included (measured by number of claims in which 
these services appear) are as follows: 

use of non-emergency crisis support in CMHC; 

use of outpatient assessment; 

use of outpatient therapy; 

use of outpatient medication management; 

use of clubhouse; 

use of case management. 

The above services are those most frequently provided by the CMHCs. 
Three of the services remained fee-for-service throughout the five-year time 
frame we considered, in both Richmond and Tidewater. Certain services - 
the use of crisis support, clubhouse, and case management - were part of a 
special program, the State Plan Option program, which was available in both 
settings with fee-for-service reimbursement. State Plan Option services 
support successful community care and residence. Non-emergency crisis 
support offered by CMHCs includes their crisis intervention services in the 
community, with the goal of stabilizing the client and allowing him or her to 
remain in the community. The clubhouse service is a psychosocial 
rehabilitation program that provides a supportive environment and promotes 
independent living in the community. The case management services 
include coordination and integration of care and services for the client. 

In the Tidewater area, the remaining three services - outpatient assessment, 
outpatient therapy, and medication management - changed from fee-for-service 
to the capitated managed care plan in the last three years of our study. 
These services are limited to those provided by 
CMHCs as the billing providers. By definition, these are outpatient 
providers. Assessment is defined as a psychiatric diagnostic interview 
(procedure code 90801). Therapy includes individual therapy (excluding 
untimed billings), family therapy, and group therapy. Medication management 
includes prescriptions and evaluation of medication needs. It does not 
include administration of medications. 

Only services provided by the CMHCs and paid for by Medicaid are 
included in the analysis. CMHCs may also provide services to clients that 
are not covered by Medicaid; these are not included. 

7.4 RESULTS 

7.4.1 Trends for SMI patients

Table 7.1 shows the number of SMI claims in the study localities, by year.
The percentage of the sample comprising SMI Medicaid recipients rose 
steadily from 9% in 1994 to 13% in 1998. Medicaid managed care for 
behavioral care was implemented in 1996 at the Tidewater site. Thus, it is 
prudent to examine the descriptive statistics for pre- and post-managed care 
in the experimental (Tidewater) and control (Richmond) groups of CMHCs. 

Table 7.2 provides descriptive statistics for all output and input variables 
before and after managed care at both sites. For each variable, the table 
shows both its mean (first row for each variable) and standard deviation 
(second row). There is a notable difference in the volume of outputs from 
pre-managed care to post-managed care in Richmond and in the Tidewater 
areas. Furthermore, average output in the Tidewater area generally is higher 
than that in Richmond. On the input side, there are varying practice patterns 
between the two areas. However, these are not statistically significant, with 
the exception of outpatient assessment.

Table 7.1 Trends for Serious Mental Illness (SMI) by site: Number of
claims by year*

Year    Tidewater Area         Richmond
        Managed Care Site      Fee-for-Service Site
1994    2,659                  1,500
1995    1,754                  954
1996    2,594                  1,817
1997    2,601                  2,132
1998    2,681                  2,070

* The 1995 data had missing recipient and claims data, so the reported figures are an
undercount. Data for 1994, 1995, and 1996 include only fee-for-service claims.
Data for 1997 and 1998 include fee-for-service and managed care claims.



7.4.2 Windows analysis - Efficiency results 

Evaluations were performed using the data from 1994 through 1998. The 
windows analysis method of DEA developed by Charnes and colleagues
[21] was employed. 

Efficiency results are presented in Table 7.3. With 12 CMHCs observed over a
five-year window, there are 60 CMHC-year observations, of which 31 are
classified as efficient. Richmond had four CMHCs in this study and Tidewater
had eight, giving 20 and 40 observations, respectively. Only seven of the 20
Richmond observations were efficient, and the average efficiency score for
Richmond CMHCs was 0.753.

On the other hand, Tidewater CMHCs displayed much higher efficiencies,
with an average score of 0.895 and 24 efficient observations (out of 40)
during the same five-year window.

The five-year trend of efficiency scores (shown later in Table 7.6) displays a 
generally increasing trend for the Tidewater CMHCs, but stagnation, even 
retrenchment in the Richmond CMHCs. Table 7.4 compares efficiency 
scores before and after managed care implementation in both localities, and 
shows that efficiency at the Tidewater CMHCs is significantly higher after 
managed care than before.

Table 7.2 Descriptive statistics before and after managed care*

                                  Before Managed Care       After Managed Care
                                  (1994 and 1995)           (1997 and 1998)
Variables                         Richmond    Tidewater     Richmond    Tidewater
                                  N=8         N=16          N=8         N=16

Outputs
Numbers of Medicaid SMI           185.38      266.56        281.88      428.75
patients treated in the           (171.23)    (175.73)      (276.61)    (301.22)
Medicaid SSI eligibility
category**
Numbers of patients treated       30.50       50.88***      84.00       98.25
in all Medicaid eligibility       (15.84)     (39.95)       (74.29)     (86.49)
categories except SSI**

Inputs
Non-emergency crisis support      111.95      172.75        224.88      199.38
                                  (137.46)    (161.47)      (312.92)    (166.83)
Outpatient assessment             16.55       42.94***      282.13      67.19
                                  (11.95)     (51.46)       (526.20)    (55.74)
Outpatient therapy                241.75      349.13        439.13      306.88
                                  (106.70)    (450.74)      (391.30)    (233.66)
Outpatient medication             724.13      689.88        715.13      820.50
management                        (775.70)    (695.96)      (622.89)    (524.11)
Clubhouse                         1008.25     1108.18       2388.25     1020.88
                                  (1187.40)   (1450.79)     (3413.87)   (1062.21)
Case management                   1309.00     1351.25       1930.25     1710.75
                                  (1424.11)   (1206.53)     (1616.21)   (1423.81)

* N=48; Mean numbers are shown in the first row for each entry; the entry in the
second row (shown in parentheses) is the standard deviation.

** SSI = supplemental security income

***P<0.01

Table 7.3 Efficiency results*

                          Richmond        Tidewater
Number of results
  Efficient               7               24
  Inefficient             13              16
  Total                   20              40
Efficiency Score
  Efficients included     0.753 (0.21)    0.895 (0.16)
  Efficients excluded     0.619 (0.13)    0.737 (0.15)

*N=60; numbers not in parentheses represent mean values; numbers shown in
parentheses are standard deviations.


Table 7.4 Technical efficiency score differences between Richmond
and Tidewater areas*

                          Richmond        Tidewater       t-Value for
                          (N=8)           (N=16)          Mean
Before Managed Care       0.768 (0.20)    0.865 (0.17)    1.26**
(1994 and 1995)
After Managed Care        0.748 (0.21)    0.938 (0.14)    2.30***
(1997 and 1998)

*N=60; numbers not in parentheses represent mean values; numbers shown in
parentheses are standard deviations.

**P<0.01
***P<0.05



7.4.3 Inefficiency score differences between Richmond and Tidewater areas 

The sources of inefficiency are investigated and depicted in Table 7.5 for 
pre- and post-managed care in both localities. Tidewater CMHCs generally 
reduced inefficiencies after the implementation of managed care, as reflected 
by their efficiency scores. To do so, CMHCs must increase their outputs 
while reducing their inputs, or reduce their inputs while keeping the outputs 
steady. The majority of Tidewater CMHCs achieved this goal, but not 
completely. There is room for further input reduction for the inefficient
Tidewater and Richmond CMHCs. For example, after the implementation
of managed care, inefficient Tidewater CMHCs (rightmost column of Table
7.5) are using 19 more units of crisis support for non-emergency care, 4
more outpatient assessments, 60 more instances of outpatient therapy, 100
more instances of outpatient medication management, 333 more clubhouse
arrangements, and 112 more instances of case management, than their
efficient counterparts do. In other words, other CMHCs with similar profiles
use far fewer resources to provide similar outputs than the inefficient
CMHCs do. The magnitude of the inefficiencies and the improvement
needed for Richmond CMHCs are even more dramatic than is the case
for the inefficient Tidewater CMHCs.

Table 7.5 Inefficiency score differences before and after managed
care*

                                  Before Managed Care       After Managed Care
                                  (1994 and 1995)           (1997 and 1998)
Input Variables                   Richmond    Tidewater     Richmond    Tidewater
                                  N=5         N=8           N=5         N=5
Non-emergency crisis support      40.10       36.19         105.54      19.32
                                  (65.54)     (56.88)       (136.89)    (53.00)
Outpatient assessment**           4.85        16.97         230.90      4.03
                                  (6.05)      (36.45)       (505.27)    (7.54)
Outpatient therapy**              142.27      49.17         181.59      60.27
                                  (146.91)    (65.48)       (319.09)    (112.54)
Outpatient medication             325.14      221.56        184.59      100.27
management**                      (562.12)    (377.27)      (245.62)    (272.72)
Clubhouse                         359.09      342.60        1745.43     333.10
                                  (796.00)    (1041.54)     (3156.75)   (876.91)
Case management                   467.42      204.42        542.87      111.53
                                  (808.65)    (273.70)      (671.77)    (270.36)

*Numbers not in parentheses represent mean values; numbers shown in parentheses
are standard deviations.

** For 1997 and 1998 Tidewater values, the encounter data set used as the source of
data has quality issues; caution should be used in interpreting these results.

7.4.4 Cone ratio model - Weight restrictions and practice patterns 

DEA can also be used to direct provider behavior toward those practice 
styles found to be not only effective but also cost efficient. This can be
accomplished either by utilizing a weight-restricted DEA model [11, 23], or
by calculating preferred ratio constraints and restricting each CMHC’s ratio 
to a particular level. Weight-restricted DEA calculations limit the use of 
certain virtual inputs and outputs, thereby creating efficiency scores relative 
to the frontier defined in the preferred efficient practice style. 

Figure 7.1 is a conceptualization of a model with two inputs - use of case 
management and use of non-emergency crisis support - and one output - 
Medicaid SMI patients with SSI eligibility. In the example illustrated in
Figure 7.1, there are 12 CMHCs and three practice styles. Practice Style 3 
can be defined as a case-management oriented model. Here case 
management is designated as the preferred type of treatment management, 
i.e. preferred over more expensive treatments. The ratio constraints can be 
defined as case management over non-emergency crisis support; outpatient 
medication management over outpatient therapy. When restricted by these 
preferred ratio constraints, the efficiency frontier includes only that section 
creating the preferred practice style. 

To create the preferred ratio constraints that are used to define the practice 
styles, DEA weights (also referred to as prices or multipliers) are utilized. 
The desired ratio(s) are calculated using the input weights from each CMHC. 
Then, for each ratio created, the minimum, first quartile, median, third 
quartile, and maximum values are calculated. These values illustrate the 
distribution of the ratio and give the researcher choices about the level at 
which to restrict the distribution. How much restriction is placed on a 
particular ratio depends on the distribution level selected (usually median or 
third quartile values are selected initially); in the current analysis, we used 
third quartile values. These newly restricted ratios can be plugged back into 
the DEA model and will restrict the use of those selected inputs needed to 
reach the efficiency frontier. 
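
The ratio calculation described above can be illustrated with a short Python sketch. The weights below are hypothetical values chosen for illustration, not results from the study, and the variable names are assumptions made for this example. The sketch only shows how the five summary values of a weight ratio are obtained and how the third quartile would be read off as the restriction level.

```python
from statistics import quantiles

# Hypothetical DEA input weights (multipliers) for six CMHCs.
# These numbers are invented for illustration; they are not study data.
case_management_weights = [0.40, 0.25, 0.60, 0.10, 0.35, 0.50]
crisis_support_weights = [0.20, 0.50, 0.30, 0.40, 0.35, 0.25]

# Ratio of interest: case management weight over crisis support weight.
ratios = [cm / cs for cm, cs in zip(case_management_weights, crisis_support_weights)]

# Quartiles of the ratio; "inclusive" treats the data as the full population.
q1, median, q3 = quantiles(ratios, n=4, method="inclusive")

summary = {
    "min": min(ratios),
    "q1": q1,
    "median": median,
    "q3": q3,
    "max": max(ratios),
}
for name, value in summary.items():
    print(f"{name:>6}: {value:.3f}")

# The chapter restricts at the third quartile, so that value is the level
# that would be fed back into the weight-restricted model.
restriction_level = summary["q3"]
```

How the selected level enters the restricted model (for example, as a lower or an upper bound on the ratio) depends on the formulation chosen and is not specified by this sketch.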

A graphic conceptualization of a weight restricted model using the two- 
inputs-one-output model is shown in Figure 7.2, where the area identified as 
“Care Management Cone” exemplifies a balanced approach for efficient 
management of mental health patients. 

We analyzed two CMHC practice styles, as shown in Table 7.6. The first 
model (Base Model) contains no ratio constraints, and illustrates practice as 
usual. The second model is a cone ratio model, which uses weight 
restrictions. This model includes the preferred ratios: case management over 
non-emergency crisis support, and outpatient medication management over 
outpatient therapy.

Figure 7.1 DEA conceptualization of CMHC Practice Styles

Figure 7.2 Conceptualization of super efficient CMHC model
[Figure: vertical axis, Non-Emergency Crisis Support; horizontal axis, Case
Management; region labeled "Care Management Cone"]




Table 7.6 Base- and weight-restricted model efficiency scores by
year*

            Base Model                       Cone Ratio Model
Richmond
CMHC        1994  1995  1996  1997  1998     1994  1995  1996  1997  1998
R-1         0.74  1.00  1.00  1.00  0.65     0.68  0.86  0.74  1.00  0.65
R-2         0.53  1.00  0.50  0.63  0.58     0.53  1.00  0.41  0.45  0.49
R-3         0.70  0.55  0.96  1.00  1.00     0.67  0.53  0.84  0.79  0.61
R-4         1.00  0.62  0.47  0.58  0.54     1.00  0.60  0.46  0.58  0.53
Average     0.74  0.79  0.73  0.80  0.69     0.72  0.75  0.62  0.71  0.57

Tidewater
CMHC        1994  1995  1996  1997  1998     1994  1995  1996  1997  1998
T-1         0.66  0.93  1.00  1.00  1.00     0.65  0.92  1.00  1.00  1.00
T-2         1.00  0.61  0.61  0.49  0.72     1.00  0.62  0.60  0.41  0.72
T-3         0.68  1.00  0.68  0.90  1.00     0.67  1.00  0.68  0.89  0.98
T-4         0.55  1.00  1.00  0.96  1.00     0.51  1.00  1.00  0.87  1.00
T-5         1.00  1.00  0.66  1.00  1.00     1.00  1.00  0.55  1.00  1.00
T-6         0.87  1.00  1.00  1.00  1.00     0.87  1.00  1.00  1.00  1.00
T-7         1.00  1.00  1.00  1.00  1.00     0.72  0.71  1.00  0.85  1.00
T-8         0.78  0.76  1.00  0.93  1.00     0.73  0.65  1.00  0.90  1.00
Average     0.82  0.91  0.87  0.91  0.97     0.77  0.86  0.85  0.87  0.96

*The Richmond CMHCs are denoted by R-1 through R-4; the Tidewater CMHCs
are denoted by T-1 through T-8.



An input-oriented variable-returns-to-scale (VRS) DEA technique was
employed in both models to identify the best set of practice patterns. An 
input-oriented model is preferred because CMHCs can change the number 
and type of inputs they use (relaxing the assumption of mandated services by 
the state government), but not the number of clients who visit them for 
treatment. Those CMHCs that did not exhibit efficient practice patterns 
were further analyzed and compared to their peers to inquire under what 
circumstances their practice patterns would mimic the best practice behavior. 
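
For readers who want to see the mechanics, an input-oriented VRS model can be written as one small linear program per unit: shrink the unit's inputs by a factor theta as far as possible while a convex combination of observed units still uses no more input and produces at least as much output. The Python sketch below is an illustration under stated assumptions, not the software used in this study. It relies on SciPy's `linprog`, the function name `vrs_input_efficiency` is our own, and the three units with two inputs and one output are hypothetical. It also omits the weight restrictions of the cone ratio model.

```python
import numpy as np
from scipy.optimize import linprog


def vrs_input_efficiency(inputs, outputs):
    """Input-oriented VRS efficiency score for every unit.

    inputs:  array-like of shape (n_units, n_inputs)
    outputs: array-like of shape (n_units, n_outputs)
    """
    X = np.asarray(inputs, dtype=float)
    Y = np.asarray(outputs, dtype=float)
    n_units, n_inputs = X.shape
    n_outputs = Y.shape[1]
    scores = np.empty(n_units)

    for o in range(n_units):
        # Decision variables: [theta, lambda_1, ..., lambda_n].
        cost = np.zeros(n_units + 1)
        cost[0] = 1.0  # minimize theta

        # Inputs: sum_j lambda_j * x_ij - theta * x_io <= 0
        input_rows = np.hstack([-X[o].reshape(-1, 1), X.T])
        # Outputs: -sum_j lambda_j * y_rj <= -y_ro
        output_rows = np.hstack([np.zeros((n_outputs, 1)), -Y.T])

        A_ub = np.vstack([input_rows, output_rows])
        b_ub = np.concatenate([np.zeros(n_inputs), -Y[o]])

        # VRS convexity constraint: the lambdas sum to one.
        A_eq = np.concatenate([[0.0], np.ones(n_units)]).reshape(1, -1)

        result = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                         bounds=[(0, None)] * (n_units + 1), method="highs")
        scores[o] = result.fun

    return scores


# Hypothetical data: two inputs, one output, three units.
example_inputs = [[2, 4], [4, 2], [4, 4]]
example_outputs = [[1], [1], [1]]
example_scores = vrs_input_efficiency(example_inputs, example_outputs)
print(np.round(example_scores, 3))
```

In this toy data set the first two units lie on the frontier, while the third uses more of both inputs than the midpoint of the first two and so receives a score of 0.75.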

Table 7.6 compares the results from the base and the cone ratio models. In 
the cone ratio model, the efficiency of CMHCs in Richmond is significantly 
less than in the base model. The Tidewater CMHCs’ efficiency scores were
reduced during the pre-managed care era; they were significantly higher
after managed care. A perfect efficiency score is a score of 1.0. Richmond 
had only three perfectly efficient CMHCs in the cone ratio model, as 
compared to seven in the base model, yielding a 57.1% reduction in perfect 
efficiency. On the other hand, the number of instances of perfectly efficient 
CMHCs in Tidewater dropped to 20 in the cone ratio model from 24 in the 
base model, yielding only a 16.7% reduction. This shows the power of the 
cone ratio model, which produces more stringent efficiency outcomes. 
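
The counts and percentage reductions quoted above can be checked directly against Table 7.6. The short Python sketch below copies the scores from that table, counts the occurrences with a score of 1.00 in each model, and computes the relative reduction. The dictionary layout and the helper name `count_efficient` are choices made for this illustration only.

```python
# Efficiency scores for 1994-1998, copied from Table 7.6.
base_model = {
    "R-1": [0.74, 1.00, 1.00, 1.00, 0.65],
    "R-2": [0.53, 1.00, 0.50, 0.63, 0.58],
    "R-3": [0.70, 0.55, 0.96, 1.00, 1.00],
    "R-4": [1.00, 0.62, 0.47, 0.58, 0.54],
    "T-1": [0.66, 0.93, 1.00, 1.00, 1.00],
    "T-2": [1.00, 0.61, 0.61, 0.49, 0.72],
    "T-3": [0.68, 1.00, 0.68, 0.90, 1.00],
    "T-4": [0.55, 1.00, 1.00, 0.96, 1.00],
    "T-5": [1.00, 1.00, 0.66, 1.00, 1.00],
    "T-6": [0.87, 1.00, 1.00, 1.00, 1.00],
    "T-7": [1.00, 1.00, 1.00, 1.00, 1.00],
    "T-8": [0.78, 0.76, 1.00, 0.93, 1.00],
}
cone_ratio_model = {
    "R-1": [0.68, 0.86, 0.74, 1.00, 0.65],
    "R-2": [0.53, 1.00, 0.41, 0.45, 0.49],
    "R-3": [0.67, 0.53, 0.84, 0.79, 0.61],
    "R-4": [1.00, 0.60, 0.46, 0.58, 0.53],
    "T-1": [0.65, 0.92, 1.00, 1.00, 1.00],
    "T-2": [1.00, 0.62, 0.60, 0.41, 0.72],
    "T-3": [0.67, 1.00, 0.68, 0.89, 0.98],
    "T-4": [0.51, 1.00, 1.00, 0.87, 1.00],
    "T-5": [1.00, 1.00, 0.55, 1.00, 1.00],
    "T-6": [0.87, 1.00, 1.00, 1.00, 1.00],
    "T-7": [0.72, 0.71, 1.00, 0.85, 1.00],
    "T-8": [0.73, 0.65, 1.00, 0.90, 1.00],
}


def count_efficient(scores, site_prefix):
    """Count CMHC-year occurrences with a perfect score at one site."""
    return sum(
        1
        for cmhc, yearly in scores.items()
        if cmhc.startswith(site_prefix)
        for score in yearly
        if score == 1.00
    )


reductions = {}
for site, prefix in [("Richmond", "R-"), ("Tidewater", "T-")]:
    base = count_efficient(base_model, prefix)
    cone = count_efficient(cone_ratio_model, prefix)
    reductions[site] = (base, cone, round(100 * (base - cone) / base, 1))
    print(site, base, cone, f"{reductions[site][2]}%")
```

The script reproduces the figures in the text: seven versus three occurrences in Richmond (57.1%) and 24 versus 20 in Tidewater (16.7%).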

7.5 DISCUSSION 

Over the past decades, researchers have demonstrated differences in the 
amount of resources used for health care in this country due to varying 
patterns of care. The most probable cause for this variation is varying 
provider practice styles. There is a growing concern about the efficiency 
with which health care services are delivered, and about which of the 
varying practice styles are more efficient, and thus more appropriate. This 
chapter has described how we developed and applied a DEA methodology as 
a mechanism to identify the most efficient practice patterns for behavioral 
health care, and to evaluate the variations in resource use associated with
different practice styles. The data used in this chapter are useful for
methodological development. However, the managed care data are from the 
early years of a new data system and there are known concerns regarding the 
data quality. Therefore the results should be used to understand the method, 
but should not be used to judge the actual efficiency of the organizations we 
analyzed. 

Given that caution, several observations can be made. The efficiency score 
of providers in Tidewater increased after the implementation of managed 
care, but the efficiency score of providers in Richmond did not change. The 
differences in the efficiency score of the two regions are statistically 
significant after managed care was implemented. However, the differences 
in inefficiency scores between the two regions are not significant and are 
unchanged after the implementation of managed care. Despite the 
limitations of the data, our analysis demonstrated that providers practiced 
more efficiently - that is, they used fewer resources to produce similar 
outputs - under the managed care payment system. The two main areas that 
account for the efficiency differences between the two regions are the case 
management and non-emergency crisis support services. 



In this study we were limited in the selection of input/output variables 
because of sparse data on certain service variables. Furthermore, there were 
some concerns about the quality of data for input variables with respect to
outpatient assessment, outpatient therapy, and outpatient medication
management. 

We also recognize that the issue of the quality of care raises the question of 
the effectiveness of care by the CMHCs. We assumed that the quality of 
care is the same from all the providers. Further analysis is needed to identify 
how efficiency affects the quality of care.

SUMMARY

Simulation, as it is typically taught, is a rather mechanical process. Students 
are taught to follow a recipe: analyze a system, design a model, convert the 
model to computer code, collect data, verify, validate, and analyze the 
output. In practice, many analysts find that simulation is an odd combination 
of art, science, and marketing. Using this technique appropriately, in any 
industry, involves more than simply following the textbook. In our
experience, health care provides some unique challenges for the
modeler. This chapter describes four different practical examples of using 
simulation to analyze a problem in an acute care hospital. The specific 
examples are not described in detail, since the applications have appeared in 
other publications. The emphasis here is to present some of the obstacles that 
were encountered and the lessons learned.

8.1 INTRODUCTION

Simulation has a vast range of applications in health care. Anyone who has
ever visited a hospital emergency room, undergone surgery, or even visited 
their family doctor will recognize that the provision of health care is a 
complex, stochastic process with an overall structure analogous to a network 
of queues. The heterogeneity of customers in this system, the vast range of 
potential paths through the network, and the time-sensitivity of service make 
health care a “textbook” application for simulation. 

The application of simulation in a health care setting is not always as simple 
and straightforward as one might think from reading the standard texts. In
this chapter we present four simulation studies and describe lessons learned 
during the projects. The objective of this chapter is not to describe how to 
conduct a simulation study, or to provide all details for the four projects, 
since this material appears elsewhere in the literature. Our goal is to give 
analysts an idea of the issues that arise when an operations research 
technique is applied to a health care setting. 

The projects described in this chapter include: a study to evaluate the link 
between inpatient census and the surgical schedule; a study to evaluate the 
causes of, and solutions to, emergency room wait time in a pediatrics 
hospital; a pharmacy ordering model; and a generalized simulation model for 
an acute care emergency department. In each instance, the problem is 
described, and an overview of the solution methodology is presented along 
with a summary of results. Each section concludes with a summary of the 
lessons learned during the project. 

8.2 EVALUATING THE IMPACT OF THE ELECTIVE SURGERY 
SCHEDULE ON RESOURCE ALLOCATION 

8.2.1 Description of the application 

Nursing, like many regulated health care professions, tends to go through 
human resource availability cycles. The length of time required to fully train 
a doctor or a nurse (four years or longer in many jurisdictions) means that 
decisions made today regarding training spaces in universities and colleges 
only have a noticeable impact five to ten years later. Of course, in the period 
of time between when the plans are made and come to fruition, the demand 
for health care professionals may have changed. This is a common problem 
in almost all medical human resource planning. 

Nursing, as a profession, has a number of unique characteristics that make 
human resource planning more difficult still. The profession is 
disproportionately female, and thus child rearing and family responsibilities 
have an impact on participation in the market place. As with other
professions, general economic conditions, quality of work-life issues, and
random fluctuations in the labor market also affect participation. 

In 1989, Toronto experienced a shortage of qualified nurses. A good 
economy, combined with a rapidly rising housing market in the metropolitan 
Toronto area, caused a net outflow of nursing personnel from downtown to 
suburban institutions. To deal with this problem, nursing leaders from five 
urban Toronto hospitals collaborated to discuss possible ways to attract and 
retain nurses in their institutions. The number one nursing complaint, the 
amount of money paid to nursing staff, was not open to change. The second 
most important issue was the work week; in short, nurses wanted less 
weekend work. 

One of the members of the committee facetiously suggested, “If we did 
surgeries on Monday for people with length of stay of four nights, and did 
only day surgery on Friday, we could empty the wards on the weekend, and 
give nurses more weekends off.” The suggestion was clearly not practical, 
but the idea that we could change the surgical schedule to reduce the 
weekend ward census was thought to be interesting. A project was 
subsequently funded by a grant from the Ontario Ministry of Health and five 
Toronto teaching hospitals: Toronto Hospital for Sick Children, Toronto 
General Hospital, Sunnybrook Health Science Centre, Mount Sinai Hospital 
and Toronto Western Hospital. (Our co-investigator was Professor Linda 
O’Brien-Pallas, Faculty of Nursing, University of Toronto.)

The study lasted for two years in 1991-93 and involved developing a 
simulation model to use as a decision support tool [1, 2]. The model 
included the operating rooms, the recovery room, intensive care units and 
regular inpatient wards. We were primarily interested in surgical patients 
since 90% of all surgical patients were scheduled in advance and, therefore, 
were somewhat controllable. Conversely, it was felt that nothing could be 
done to control medical patients, since 90% of all medical patients were 
emergency admissions. In all of the hospitals in the study, operating room 
time was assigned on a “block booking” basis. Surgeons received blocks of 
operating room (OR) time (e.g., every Monday morning for three hours) and 
were free to schedule patients in any order within their assigned blocks. 
Typically, elective surgery took place Monday to Friday on the day shift 
with one or two rooms available nights and weekends for emergencies. 

Given this arrangement, we concluded that by changing the weekly OR 
schedule, we could influence workload and census in the rest of the hospital. 
By extension, we argued, it should be possible to determine a schedule that 
would be optimal from a staffing perspective. Furthermore, because we did 
not anticipate making any changes to the number or length of assigned
blocks, we assumed that our schedule would have no impact on patients or
their clinical care. The only impact, as far as we could tell ahead of time, 
would be minor inconvenience to surgeons who might have their block time 
rearranged within the master surgical schedule. 
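
The link between the weekly OR schedule and the ward census can be illustrated with a simple deterministic calculation. The Python sketch below is not the simulation model described in this chapter. It assumes a schedule that repeats every week, fixed lengths of stay, and no cancellations or emergency admissions, and the patient numbers are hypothetical. The function name `nightly_census` is an assumption made for this example.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def nightly_census(schedule):
    """Steady-state number of occupied beds for each night of the week.

    schedule: list of (admit_day_index, nights_of_stay, patients_per_week),
    where admit_day_index is 0 for Monday through 6 for Sunday.
    """
    census = [0] * 7
    for admit_day, nights, patients in schedule:
        # A patient admitted on admit_day occupies a bed for `nights` nights,
        # wrapping around into the following week when necessary.
        for offset in range(nights):
            census[(admit_day + offset) % 7] += patients
    return census


# Hypothetical schedule A: all four-night cases on Monday, day surgery
# (zero nights) on Friday, as in the committee member's suggestion.
front_loaded = [(0, 4, 10), (4, 0, 10)]

# Hypothetical schedule B: the same ten four-night cases spread evenly
# over the five weekdays.
evenly_spread = [(day, 4, 2) for day in range(5)]

for name, schedule in [("front-loaded", front_loaded), ("evenly spread", evenly_spread)]:
    census = nightly_census(schedule)
    print(name, dict(zip(DAYS, census)))
```

With these invented numbers, the front-loaded schedule leaves the ward empty from Friday night through Sunday night, while the evenly spread schedule keeps patients in the ward every night of the week, even though both schedules generate the same total number of patient-nights.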

With these assumptions in mind, we built a simulation model, a database and 
user interface for the simulation. The model included all scheduled surgical 
patients and allowed for emergency patients who could preempt elective 
surgery as well as medical patients competing for intensive care unit (ICU) 
beds. The database included an underlying nursing workload model that 
estimated the total hours in each ward given the patient mix and volume 
flowing through it. If there were no beds available or not enough nursing 
hours when a patient was to be admitted, elective surgery would be canceled. 
The model generally used a first-come-first-served logic for allocating 
scarce resources. A small percentage of patients were also canceled for other 
reasons. 

We used a two-pronged approach to collecting data for the model. We spent 
several months in each hospital analyzing the process to understand how 
patients flowed through the facility, creating process flow charts and 
collecting unique site-specific data. We also took advantage of an existing 
database of discharge records maintained by the Canadian Institute for Health
Information (CIHI). CIHI is a third-party organization that stores a discharge summary of
every patient admitted to a hospital in Canada. Institutions in Canada are 
required to contribute data to this source, which is used by hospitals for their 
own internal review as well as by federal and provincial authorities. 

Through the interface, the user was able to set the surgical schedule, make 
adjustments to the surgeons' case mix, specify the number of beds and 
nurses on each ward and change a variety of control parameters. The 
simulation itself was driven by a data trace. Because of the often
confounding factors relating to age, gender, disease, co-morbidity, treatment, 
and outcome, we reasoned that it would be more practical to dispense with 
the idea of developing and fitting distributions for key simulation variables 
such as length of stay, processing time, etc, since we could not assume 
independent and identically distributed observations. Instead, we decided to 
sample directly from a large list of patients available from hospital discharge 
records. Thus when we needed to “create” a patient in our model, we 
randomly selected a person from this existing list and simply associated all 
of that patient’s demographic, treatment, and outcome data with the 
simulated patient. This mechanism, we felt, would make the simulation easy 
to port between sites and easy to validate.

After running random patients through the model for a two-week warm-up
period, we ran ten replications of two weeks. Upon the completion of the 
run, we produced summary statistics on estimated annual patient volumes, 
cancellations, emergencies, and patient census and nursing hours in each 
ward by day of week. 
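
The trace-driven approach and the replication scheme described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the model built for the study: the discharge records, the fixed number of admissions per day, and the function name `run_replication` are all hypothetical, and the sketch ignores the surgical schedule, bed limits, nursing hours, and cancellations. It shows only how whole patient records are sampled from an existing list, how a two-week warm-up period is discarded, and how ten two-week replications are run.

```python
import random
from statistics import mean

# Hypothetical discharge records; the fields and values are invented.
RECORDS = [
    {"service": "ENT", "los_nights": 1},
    {"service": "General Surgery", "los_nights": 4},
    {"service": "Cardiovascular", "los_nights": 7},
    {"service": "Orthopedics", "los_nights": 3},
]


def run_replication(rng, admissions_per_day=5, warmup_days=14, run_days=14):
    """Return the mean nightly census over the measured period."""
    nights_remaining = []  # one entry per patient currently in hospital
    nightly_census = []
    for day in range(warmup_days + run_days):
        # "Create" each patient by sampling a whole record from the trace,
        # rather than drawing length of stay from a fitted distribution.
        for _ in range(admissions_per_day):
            record = rng.choice(RECORDS)
            nights_remaining.append(record["los_nights"])
        # Statistics are collected only after the warm-up period.
        if day >= warmup_days:
            nightly_census.append(len(nights_remaining))
        # Overnight, each patient uses one night; finished patients leave.
        nights_remaining = [n - 1 for n in nights_remaining if n > 1]
    return mean(nightly_census)


# Ten replications, each with its own random number stream.
results = [run_replication(random.Random(seed)) for seed in range(10)]
print([round(r, 1) for r in results])
print("mean census across replications:", round(mean(results), 2))
```

Because the records and the admission rate are invented, the printed figures have no empirical meaning. The point of the design is that each simulated patient inherits a consistent set of attributes from a real record, so correlations among those attributes are preserved without having to model them.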

8.2.2 Challenges encountered 

Timing/project cycle time. Simulations typically look simple to build; or at
least they look simple at the start of a project. Our project was originally 
designed for a two-year cycle. The pilot model took approximately 12 
months to complete. Ports to other institutions, which were scheduled to take 
two months, took about four months apiece to complete. Thus, by the end of 
the project, more than four years had elapsed since its inception in 1989. The 
reality of 1993 was much different than the reality of 1989, particularly in 
the health care sector in Canada. While 1989 was the high point in an 
economic cycle, 1993 was a low point. Thus, by the time our program was 
ready, the government was cutting health care budgets, and hospitals were 
laying off nurses! Simply put, weekend workload and quality of work-life 
issues had dropped off the radar screen; people were much more interested 
in holding onto their jobs than getting the weekend off. 

Fortunately, this unexpected turn of events did not detract from the value of 
the project. The model turned out to be very useful as a mechanism to 
balance the use of increasingly tight hospital resources. The simulation 
allowed users to experiment with various allocations of OR time and 
forecast the impact on ward census, nursing workload, ICU beds and
recovery rooms. Several of the sites used our model to improve their 
operations. 

For example, at the Toronto Hospital for Sick Children, in one ward, the 
census was double on Wednesday night compared to every other day of the
week. By making a few minor adjustments, we were able to suggest an OR 
schedule that would balance the nursing workload over weekdays. As 
another example, we used the model to look at Christmas closing in 1994 
after the Ontario government asked hospitals to close all elective surgery for 
two weeks as a cost reduction measure. Mount Sinai Hospital asked us to 
complete an analysis of residual demand for OR time and ward space due to 
emergency patients. We used the model to predict the staffing levels that 
would be needed to cover this demand for the two weeks. At Sunnybrook 
Health Sciences Centre the model was used in a number of planning 
scenarios, not to balance nursing workload, but to calculate production limits 
for their cardiovascular surgery program.

At one institution we ran into a problem validating our model. The
simulation model suggested that the OR time currently allocated to the Ear, 
Nose and Throat (ENT) service could accommodate almost twice as many 
patients as they were actually serving. We searched for the cause of the 
discrepancy for several days in the simulation. Ultimately, we discovered 
that ENT had a habit of not always using all of their allocated time. 
Meanwhile, General Surgery was starving for OR time. Whenever ENT did 
not need the allocation, someone in General Surgery was happy to use it. 
The OR managers had not noticed the problem since all of the booked rooms 
were being fully utilized. 

Data collection. In any simulation, data collection, verification, and
validation are major issues. In our experience in health care, no one ever had 
the right data in the form that we needed it. Health care information systems 
are typically designed to meet clinical requirements, not administrative 
needs. The CIHI data was a mixed blessing for our project. The CIHI data 
was universally available for all institutions, in a standard format, and from a 
single source, and was thus easy to access and import into the simulation. 
We did, however, find a number of weaknesses in the database which 
limited its applicability for a simulation study. 

Since the discharge summary is only a summary of what happened to a 
patient, it was not always possible to entirely reconstruct a patient’s process 
through the hospital from their discharge report. For example, a patient 
admitted as a medical patient for treatment of diabetes falls and breaks a hip 
during her hospitalization. If, at discharge the broken hip is considered to 
have contributed more to the patient’s length of stay than the diabetes, the 
patient may then have been labeled as a surgical patient. Without complete 
access to a patient’s record, reconstructing a patient’s length of stay often 
involved some assumptions and some estimation. 

Furthermore, we found that source data sent to CIHI was not always viewed 
by the institutions as reliable. (This is rather surprising given that the 
institutions themselves are responsible for abstracting and summarizing the 
data that is forwarded to CIHI.) Finally, the lag between when data was 
collected, abstracted, and made available to CIHI meant that we typically 
had to use patient abstracts that were at least a year old (and in one instance 
two years old) in the model. This led to a common complaint among 
potential users that the data was “too old” and “not representative of what 
we’re doing now”. 

Every hospital was different Our model was designed to be flexible and to 
provide the ability to answer a wide variety of questions. We wanted to be 
able to test potential length-of-stay variations by age, disease, sex, etc. 
Furthermore, we wanted the model “portability process” to be as simple as 
possible. Our intention was to develop one generic model and simply move 
this model from place to place, plugging in new patient records and a small 
amount of site-specific data (i.e., the number of wards and the beds on each). 
In practice we found it very difficult to create a single, generic, general- 
purpose patient simulation. Each institution had a unique combination of 
services, programs, and unique “quirks” that made it difficult to directly 
move a model from one location to another. These quirks ranged from 
unique processing rules to arcane details of the physical plant. 

For example, at The Toronto Hospital for Sick Children, the managers 
suspected that an old bank of elevators that frequently broke down 
significantly impacted transportation time! In this case, we included the 
elevator in our model. 

When we initially developed the pilot simulation at The Toronto Hospital for 
Sick Children, we decided to restrict the model to patients who had only one 
surgical procedure. The number of cases of multiple surgeries there was 
quite small. However, Sunnybrook is a regional trauma center, and multiple 
procedures are relatively common. So, we needed to modify the Sunnybrook 
model to allow for multiple surgical procedures. 

Stakeholders Getting the buy-in of all stakeholders is always a key 
component to any simulation project. However, when working in a health 
care setting, acceptance by all stakeholder groups is especially important. In 
this particular project, our assumption that the schedule rearrangement 
would be a minor issue turned out to be incorrect. Physicians, as a rule, 
control the creation of the master surgical schedule and guard it jealously. 
Schedule changes are almost never thought to be a matter of minor 
inconvenience. 

In fairness to physicians, the schedule dictates both their income and their 
work schedule. That is why, in practice, the issue is so controversial that in 
most of the hospitals that we have worked in, the administration simply 
allocates total O.R. time to each service (cardiology, general surgery, 
orthopedics, etc.). The doctors in each service then decide among themselves 
how to allocate specific blocks of time. This solves some of their political 
issues but, as a consequence, the administration relinquishes any control 
over daily work flow balance. 

8.3 CHILDREN’S HOSPITAL OF EASTERN ONTARIO (CHEO) 

8.3.1 Description of the application 

CHEO is a pediatrics teaching hospital affiliated with the University of 
Ottawa. In 1993, the hospital’s Emergency Department expressed concern 
that up to 20% of patients were forced to wait at least two hours before being 
seen by a physician. The issue was one of quality of service rather than 
quality of care, since all patients are triaged promptly and urgent cases are 
seen right away. Long waits are generally associated with patients having 
“runny noses” and other minor complaints. However, with provincial budget 
cuts looming, managers at CHEO felt it important to maintain good public 
relations. 

The Vice President of Ambulatory Care (VPAC) called us in May 1993. She 
had received eleven process improvement suggestions from staff members. 
Suggestions ranged from overhauling patient flow to making changes to the 
physical layout of treatment rooms. One suggestion called for installing 
video games in the waiting room so patients would not realize how long the 
wait was. While the VPAC thought that many of the suggestions were 
interesting, she needed a mechanism to provide quantitative analysis of the 
options. 

To determine the impact of the various strategies on patient wait time a full- 
scale patient simulation model was developed [3]. The model included all of 
the major patient processes in the emergency department (ED): patient 
arrival, registration, triage, assessment, testing, treatment, and admission or 
discharge processes. Our main evaluation criteria were the average wait time 
and the distribution of these times for each of the four triage categories 
defined by the hospital: emergency, urgent, deferrable, and medical walk in. 

In terms of modeling effort, the simulation itself was relatively simple. 
However, data collection, model validation, and output analysis required 
significant effort. One of the first things we discovered when we started 
collecting data at the hospital was the highly fractured nature of work in the 
ED. CHEO is a teaching hospital. The ED was staffed by one to three 
physicians, called Casualty Officers (COs), five to seven nurses, and a 
number of residents. Normally, each patient was seen by a nurse, a resident, 
and the CO who reviewed the resident’s assessment. On any given shift 
there were ten patient treatment rooms available for use. Patients in these 
rooms were under the care of one CO who might also have had 
responsibility for providing medical education to one or two medical 
residents. 

We noted immediately that it was extremely rare for any worker (physician, 
nurse, or resident) to complete a work cycle on one patient from start to end 
with no interruptions. More commonly, we observed that nursing and 
physician services were delivered in small discrete batches spaced over a 
fairly long time period. For example, a physician might assess a patient and 
order a test. A nurse might then collect a sample or transport the patient to 
another area. During the time the test was being performed, the physician 
would move on to treat other patients. When the results of the test became 
available, the physician would read the results, interpret them, and order a 
treatment or send the patient home. 

Since a physician had five to ten patients “on the go” at any time, work 
cycles became quite fractured. Indeed, physician work cycles were nothing 
short of chaotic given the additional requirement of also providing medical 
education for students and residents. The physician was, for example, 
required to confirm the resident’s diagnosis, provide him or her with 
background about a disease state or treatment option, and then confirm test 
results, treatment opinions, or patient instructions. Since the casualty officer 
was legally responsible for the patient’s care, no part of the treatment 
process could occur without the permission of the CO. In fact, we later 
found that COs spent about as much time interacting with residents as with 
patients. 

Once we had the model working and validated, we started a designed 
experiment. The factors that we varied included the number of COs on shift, 
the number of residents on shift, and the queuing discipline used to select 
patients from the waiting list. We did not find much of a factor effect for 
queue discipline, but we did note a strong negative effect on waiting time 
for the number of COs on shift (more COs, shorter waits) and a strong 
positive effect for the number of residents on shift (more residents, longer waits). 
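
The factor effects described above can be illustrated with a small sketch. It assumes a two-level full factorial design with factors coded as -1 (low) and +1 (high); the factor names and the waiting times are hypothetical values chosen for illustration and are not results from the CHEO study. A main effect is estimated as the mean response at the high level minus the mean response at the low level.

```python
from statistics import mean


def main_effect(runs, factor, response="wait"):
    """Mean response at the high level minus mean response at the low level."""
    high = [run[response] for run in runs if run[factor] == +1]
    low = [run[response] for run in runs if run[factor] == -1]
    return mean(high) - mean(low)


# Hypothetical 2^3 design: average wait (minutes) per simulated scenario.
# Factors: number of COs, number of residents, queue discipline.
runs = [
    {"cos": -1, "residents": -1, "queue": -1, "wait": 69.5},
    {"cos": +1, "residents": -1, "queue": -1, "wait": 49.5},
    {"cos": -1, "residents": +1, "queue": -1, "wait": 89.5},
    {"cos": +1, "residents": +1, "queue": -1, "wait": 69.5},
    {"cos": -1, "residents": -1, "queue": +1, "wait": 70.5},
    {"cos": +1, "residents": -1, "queue": +1, "wait": 50.5},
    {"cos": -1, "residents": +1, "queue": +1, "wait": 90.5},
    {"cos": +1, "residents": +1, "queue": +1, "wait": 70.5},
]

effects = {name: main_effect(runs, name) for name in ("cos", "residents", "queue")}
print(effects)
```

In this made-up data set the effect of COs is negative, the effect of residents is positive, and the effect of queue discipline is close to zero, which mirrors the qualitative pattern reported in the text.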

As described earlier, resident education was a major component of work in 
the ED. Our experiment indicated that the work created by resident 
education was so great that eliminating all residents from the ED would 
substantially reduce patient waiting time! In fact, our model indicated that 
adding one additional CO, or eliminating all residents, would result in 
approximately the same improvement in waiting time. 

Obviously, eliminating residents from a teaching hospital is not a practical 
alternative, but the results indicated that waiting time could be impacted by a 
number of different scenarios, including different numbers of physicians, 
different shift schedules, and/or the addition of a hospital “walk-in clinic” to 
treat patients with minor injuries. 

These scenarios led us to one of our more interesting results. As part of our 
plan to rearrange physician schedules, we prepared a simple plot of patient 
arrival times for each day of the week. We compared this to the COs’ shift 
schedule. We found that the demand peak (patient arrivals) often occurred several 
hours before the staffing peak. For example, on Sundays, the peak patient 
arrival period was between 10 a.m. and 1 p.m., but the peak staffing levels 
were scheduled for 5 p.m. to 7 p.m. Needless to say, the wait times for 
patients arriving in the afternoon were extremely long because a queue had 
been building all day. We were able to make significant improvements 
simply by staggering the doctors’ start times. 
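
The comparison of arrival and staffing profiles can be sketched in a few lines. The hourly figures below are hypothetical (they are not the CHEO data), and the backlog calculation is a deliberately crude fluid approximation that assumes each physician clears a fixed number of patients per hour; the function names are ours.

```python
def peak_hour(values_by_hour):
    """Return the hour index with the largest value (earliest hour on ties)."""
    return max(range(len(values_by_hour)), key=lambda h: values_by_hour[h])


def backlog_by_hour(arrivals, staff, patients_per_staff_hour):
    """Patients still waiting at the end of each hour under a simple fluid model."""
    waiting = 0
    backlog = []
    for arrived, on_shift in zip(arrivals, staff):
        capacity = on_shift * patients_per_staff_hour
        waiting = max(0, waiting + arrived - capacity)
        backlog.append(waiting)
    return backlog


# Hypothetical Sunday profile, indexed by hour of day (0 = midnight).
arrivals = [2, 1, 1, 1, 1, 2, 3, 5, 7, 9, 12, 13,
            12, 10, 9, 8, 8, 7, 7, 6, 5, 4, 3, 2]
staff = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2,
         2, 2, 2, 2, 2, 3, 3, 2, 2, 2, 1, 1]

lag_hours = peak_hour(staff) - peak_hour(arrivals)
backlog = backlog_by_hour(arrivals, staff, patients_per_staff_hour=3)
print("staffing peak trails arrival peak by", lag_hours, "hours")
print("largest backlog occurs at hour", peak_hour(backlog))
```

With these illustrative numbers the staffing peak trails the arrival peak by several hours, and the backlog keeps growing until the extra physicians come on shift. Shifting the staffing list earlier and recomputing the backlog is a quick way to explore staggered start times before running a full simulation.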

Other major recommendations that came from this project included adding 
an additional four hours of CO time daily to the main ED and implementing 
a fast-track clinic for low-acuity patients. We estimated that these 
improvements would reduce patient wait time by as much as 20%. Although 
the approval process took over a year, the hospital did eventually hire a new 
casualty officer due, in large part, to our analysis. 

8.3.2 Challenges encountered 

Data collection The fractured nature of work in the ED presented a data 
collection problem for us. While good theoretical and practical models of 
nursing workload are available, no corresponding workload standards exist 
for physicians. As a result, it was very difficult to determine, for example, 
the demand for physician time resulting from a patient presenting symptoms 
of asthma. 

Furthermore, the highly fractured nature of work cycles made manual data 
collection a difficult task. For example, much of the work a physician 
performed on a patient’s file was done when the physician was distant from 
the patient (e.g. reading x-rays, interpreting test results, discussing with 
nurses or residents). Thus, measuring physician contact time was not an 
entirely accurate method of determining workload. 

“Job shadowing” also presented some difficulties. For example, the nature of 
patient confidentiality precluded an observer from direct access to many 
types of patient-physician encounters. All in all, identifying accurate 
physician workload was a difficult task. We were, however, able to satisfy 
our data requirements through a combination of statistical work sampling 
and job shadowing. One of the project team members undertook the work 
sampling procedure, which could be performed without the observer 
necessarily having to be in the vicinity of the patient and the physician. In 
addition, the hospital provided us with two nurse instructors, who performed 
a physician job shadow. Because the nurses were clinicians, both physicians 
and patients accepted them. In the end, we were able to build a reasonable 
data sample using the two techniques. 
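
As an illustration of how work sampling yields workload estimates, the sketch below assumes that an observer records the physician's activity at randomly chosen instants and that each activity's share of time is estimated by its share of observations, with a normal-approximation confidence interval. The activity labels and tallies are hypothetical, and this is not necessarily the exact procedure used in the study.

```python
import math
from collections import Counter


def work_sampling_shares(observations, z=1.96):
    """Estimate the share of time per activity with a normal-approximation interval."""
    n = len(observations)
    shares = {}
    for activity, count in Counter(observations).items():
        p = count / n
        half_width = z * math.sqrt(p * (1 - p) / n)
        shares[activity] = (p, max(0.0, p - half_width), min(1.0, p + half_width))
    return shares


# Hypothetical tallies from 100 random-instant observations of one physician.
observations = ["patient"] * 40 + ["teaching"] * 40 + ["other"] * 20
shares = work_sampling_shares(observations)

for activity, (p, low, high) in shares.items():
    print(f"{activity}: {p:.2f} (95% interval {low:.2f} to {high:.2f})")
```

The interval width shrinks with the square root of the number of observations, so the number of sampling instants can be chosen to match the precision the model requires.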

In this application as well as most of the others, we discovered that the 
length of time required for any particular task is extremely variable. When 
things get busy in the ED, everyone tends to work a little faster. In 
particular, the casualty officers spend much less time teaching as the demand 
increases. This is not surprising, but it creates some serious modeling 
challenges. One way to avoid this issue, as we did, was to use process times 
based on data collected during the busy times. Our real objective in this 
study revolved around queue length during busy times. As a result, our 
simulated patients were treated faster than the real patients during relatively 
quiet times. 

Time frame A key challenge we faced with this project was finding the time 
to collect data, build the model, and run a reasonable set of scenarios. While 
the project originally was envisioned to be a short-term, two-week project, in 
the end we spent almost a year working on the model and its various 
components. Building the actual simulation model, as it turned out, was not 
particularly difficult or time-consuming. In fact, it took us about two weeks 
to build. The time-consuming aspect of the project was data collection. To 
complete data collection, it was necessary to: identify the data necessary to 
run the simulation, make appropriate simplifying assumptions, define the 
method by which this data should be collected, assign personnel to data 
collection, and then collect the data. 

Once the model was up and running, we found it was not possible to simply 
complete a set of runs, write up the results, and put the project behind us. 
Management at the hospital viewed the model as a useful planning tool. As 
the planning process developed at the hospital, we were asked to run the 
model under different assumptions and scenarios. Coincidentally, as we 
developed and ran these scenarios, our understanding of the ED process 
increased and we were able to point out to management results we felt were 
interesting. This resulted in a collaborative arrangement between 
management and modelers which, while fruitful, extended the project 
completion date. 

8.4 MODELING THE DRUG ORDER ENTRY PROCESS FOR 
INPATIENTS 

8.4.1 Description of the application 

Currently, in the vast majority of hospitals in North America, doctors still 
prescribe medications for hospital inpatients by scribbling notes on paper. In 
one study published in 1998, Ash, Gorman and Hersh [4] found that fewer 
than 2% of U.S. hospitals had Computerized Physician Order Entry (CPOE) 
completely or partially available and required its use by physicians. The 
initial cost of implementing CPOE is one major obstacle for hospitals. At 
Brigham and Women’s Hospital, the cost of developing and implementing 
CPOE was approximately $1.9 million, with $500,000 in maintenance costs 
per year. Installation of even “off the shelf” CPOE packages requires a 
significant amount of customization for each hospital and can be very 
expensive [5]. Finally, there may be cultural obstacles to CPOE 
implementation. For example, many physicians resist the idea of ordering 
prescriptions via computer instead of by hand. Although summary results 
were not available, the Leapfrog Group hospital survey [6] indicated that 
most U.S. hospitals are in the process of implementing CPOE. 

On the surface, the manual Medication Administration Process appears quite 
simple. The physician writes a prescription on paper at the bedside and puts 
the order in the patient’s chart. The nurse retrieves the order, transcribes it 
onto the “Medication Administration Record” (MAR) and leaves a copy of 
the order in a tray in the ward to be picked up by pharmacy technicians at 
routine times throughout the day. A pharmacist reviews the order and 
transcribes it into a computer with access to electronic patient records and 
decision support capability. The order is prepared in the pharmacy and 
delivered to the ward. The nurse checks drugs against the MAR and 
administers to the patient. The nurse records the administration on the MAR. 

What is wrong with this picture? The doctor relies on memory/knowledge to 
determine the dose of the medication, to think of patient allergies and to 
remember possible drug interactions. The nurse may not know that an order 
has been written or that the drug has arrived. The multiple transcriptions 
increase the possibility of error and are not value-added work. The physical 
transport of the order wastes time. If the nurse cannot read the order, s/he 
must check with the doctor. If the pharmacist has any questions about 
medication or dosage, s/he must page the nurse and/or the doctor and hold 
the prescription until the order is confirmed. We believe that the process 
could be greatly improved if the doctor entered the order directly into a 
computer, using a handheld device, at the bedside. 

Dr. Glen Geiger is a physician in Internal Medicine at Sunnybrook & 
Women’s College Health Science Centre in Toronto. In 1999, Glen initiated 
a study where he asked doctors and nurses in his service to record process 
times on the drug orders. He discovered that over 25% of the orders were not 
administered within the targeted time frame. Most failures were not even 
close. These were process errors; they do not include cases where patients 
received the wrong drugs. The results of this study were a surprise to 
hospital leaders and continue to surprise health care professionals from other 
areas. Many issues that we discovered at Sunnybrook are common to most 
manual drug order entry processes. 

It is fairly obvious that physician order entry will dramatically reduce cycle 
time for the process and reduce the workload of all parties - with the 
possible exception of the physician. Thus, we needed to convince the doctors 
that the system would dramatically improve the process without significantly 
increasing their workload. We decided to use simulation to quantify the 
potential for process improvement. We believed that it would be an 
important tool for demonstrating the advantages to physicians. 

In the summer of 2001, four students, including three industrial engineers 
and one medical student, were hired to perform a detailed analysis of the 
prescription process. The students spent two months documenting the 
current process through interviews and direct observation. They then 
conducted a two-week data collection during which all drug orders for a 
thirty-six bed Internal Medicine ward were tracked to facilitate the creation 
of a simulation model. The results of the detailed tracking confirmed Dr. 
Geiger’s earlier results. In particular, many medication orders were not 
administered to patients in a reasonable amount of time [7]. 

One of the surprising discoveries was that this seemingly simple process was 
actually quite complex. For example, a different process was used when a 
doctor phoned an order in to the nurse as opposed to when the order was 
written. The day and night processes are different because the pharmacy is 
closed at night. At night, instead of placing an order with the pharmacy, 
nursing staff can access a night cupboard for commonly required 
medications. 

There are also communication issues. For instance, pharmacists regularly 
visit the ward and review patient charts. Pharmacists sometimes write a “P” 
on the order. Some of the nurses knew that the “P” meant the pharmacist had 
reviewed the order. Others thought it meant they had “Pulled” the order. 
We also found some confusion surrounding a physical flag attached to the 
chart. When the doctor writes an order s/he puts the flag up. Unfortunately, 
there is only one flag on each patient chart, and it is used for all orders. 
When multiple orders (e.g. drugs, lab tests, imaging, etc.) are in the file, the 
possibility exists that the nurse will find only the first one and put down the 
flag. Nurses check the complete chart every two hours, but errors sometimes 
occur. One order was in the chart for two days before the students pointed it 
out to the physician. 

Timeliness of medication orders can be measured in several ways. For 
example, suppose a doctor prescribes medication for a patient at 11 a.m. to 
be administered three times a day at 6 a.m., 2 p.m. and 10 p.m. We could 
consider the delivery to be late if it was not back in time for the 2 p.m. dose 
administration. Pharmacokinetic practice says that a dose of medicine can be 
administered up to half of the dosage interval late (here, four hours, so as 
late as 6 p.m.) and still be on time. From a process perspective, we estimated that a 
prescription should not take longer than two hours to fill. All three measures 
were used in the study for determining whether an order was filled on time. 
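
The three timeliness measures can be written down compactly. The sketch below is one possible encoding and not the study's own code: the function name, the dictionary keys, and the timestamps are assumptions made for illustration, and the two-hour fill target is taken from the text.

```python
from datetime import datetime, timedelta


def on_time_measures(ordered_at, delivered_at, administered_at,
                     scheduled_dose_at, dosing_interval,
                     fill_target=timedelta(hours=2)):
    """Evaluate one order against the three timeliness definitions."""
    return {
        # 1. Drug is back on the ward in time for the next scheduled dose.
        "back_before_scheduled_dose": delivered_at <= scheduled_dose_at,
        # 2. Dose is given no later than half a dosing interval after schedule.
        "within_half_interval": administered_at <= scheduled_dose_at + dosing_interval / 2,
        # 3. Order is filled within the process target measured from ordering.
        "filled_within_target": delivered_at - ordered_at <= fill_target,
    }


# Hypothetical order: written at 11:00, next scheduled dose at 14:00 (8-hour interval),
# drug delivered at 14:30 and administered at 15:00. The date is arbitrary.
day = datetime(2001, 7, 9)
result = on_time_measures(
    ordered_at=day.replace(hour=11),
    delivered_at=day.replace(hour=14, minute=30),
    administered_at=day.replace(hour=15),
    scheduled_dose_at=day.replace(hour=14),
    dosing_interval=timedelta(hours=8),
)
print(result)
```

In this example the order fails the first and third measures but passes the second, which shows why the choice of definition matters when reporting the share of late orders.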

8.4.2 Challenges encountered 

Lack of control From a quality perspective, we were quite surprised with the 
apparent lack of control of the prescription process. Since no written 
documentation was available, to determine how the process worked we 
simply asked everyone what he or she did. There was no formal training for 
nurses or doctors. New staff members learned by word of mouth. Virtually 
everyone we spoke with had a different view of the process. Moreover, there 
is no standardization across the hospital; each ward had apparently 
developed its own set of procedures. We attributed this to the perception that 
the process was “simple” and therefore did not require formal documentation 
and training. 

Need for greater modeling detail In the validation of our simulation model, 
we could not get the turnaround times for medication orders in the model to 
match the times that we observed in practice. Initially, the average time in 
the model was 225 minutes, while the true average from the data was 262 
minutes, or 16% higher. This seemed odd since the distributions in the model were 
based on statistically fitting the same data. 

A major cause of this discrepancy originated in the pharmacy portion of the 
model. Initially, we had assumed that the pharmacy part of the process 
would be fairly reliable once the orders arrived there, so we chose to model 
the pharmacy as a black box. Since the pharmacy was computerized, we 
expected the process would provide consistent results and that when an order 
was picked up, it would be processed expeditiously and delivered back to the 
ward. We were surprised to discover dramatic variations in turnaround 
times. 

Several months after the initial study, and long after the summer students 
had returned to school, we concluded that we needed to expand the scope of 
the analysis and perform a detailed process analysis of the pharmacy area. 
We discovered several anomalies. There was an 8 a.m. rush in the pharmacy 
to fill all of the orders that had accumulated overnight. The pharmacy 
processed each ward as a group of orders, and the sequence of the wards 
varied daily. Many long delays occurred when a particular ward was left 
toward the end of the sequence. We also learned that the pharmacy’s 
workforce was highly variable. The pharmacy could not tell us how many 
pharmacists were working at any one time; the assignments varied hour-by- 
hour and day-by-day. Furthermore, it was found that a major complication 
was created by orders requiring clarification. We discovered that over 13% 
of orders required the pharmacist to call the doctor. These orders would be 
set aside temporarily while the pharmacist paged the doctor. We were unable 
to collect meaningful statistics on how long it took to get an answer to a 
page; however, it appears that about 5% of orders took more than three 
hours, and many of these were not resolved in the same day. Moreover, 
many of the pharmacists processed the simpler orders first, and saved 
clarifications until later in the day when they had some spare time. 
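
The effect of ward sequencing on turnaround can be seen with a very small calculation. This sketch assumes that the pharmacy works through one ward's batch at a time without interruption; the ward names and batch durations are hypothetical, and clarification delays and staffing variation are ignored.

```python
def completion_times(sequence, batch_minutes, start=0):
    """Minutes after `start` at which each ward's batch of orders is finished."""
    clock = start
    finished = {}
    for ward in sequence:
        clock += batch_minutes[ward]
        finished[ward] = clock
    return finished


# Hypothetical time (minutes) needed to fill each ward's overnight orders.
batch_minutes = {"A": 30, "B": 45, "C": 25, "D": 40}

ward_d_last = completion_times(["A", "B", "C", "D"], batch_minutes)
ward_d_first = completion_times(["D", "A", "B", "C"], batch_minutes)
print("Ward D finished after", ward_d_last["D"], "minutes when processed last")
print("Ward D finished after", ward_d_first["D"], "minutes when processed first")
```

Because the sequence varied daily, the same ward could see very different turnaround times from one day to the next, which a black-box pharmacy model with a single fitted distribution per order would not reproduce by ward.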

Technology implementation As described above, the motivation for the 
simulation was to be able to demonstrate the potential process improvements 
that accrue from using automated physician order entry. Sunnybrook has 
already purchased the software to implement the automated prescription 
entry process. However, the system is still in development and the user 
interface must be customized. We cannot complete the simulation without 
first performing experiments with the interface to determine the distribution 
for access time. We do not expect it to take long, but this is likely to be the 
central measure of success for the physicians. We expect to have a pilot 
version ready by Spring 2004. 

8.5 THE CROWDED STUDY: CAUSES AND RELATIONSHIPS OF 
OVERCROWDING AND WAITING IN DIFFERENT EMERGENCY 
DEPARTMENTS 

8.5.1 Description of the application 

Waiting times and overcrowding in the Emergency Department (ED) have 
become increasingly serious problems over the past several years. In the 
United States, surveys of hospital directors have reported ED overcrowding 
in almost every state [8, 9]; ED overcrowding has also been reported in 
Europe [10]. In most hospitals Emergency Department overcrowding is a 
symptom, rather than a cause, of the problem. For example, overcrowding in 
the province of Ontario in Canada is often attributed to patients who have 
been admitted to hospital, but who are waiting in the ED until a ward bed 
becomes available. Beds are often blocked in the wards because of discharge 
delays (e.g. waiting for test results, waiting for nursing home space, 
rehabilitation beds or home care). Thus, to really understand how the ED 
functions and why it backs up, it is necessary to develop a detailed process 
analysis specifically focusing on the impact of bed blockers. 

A number of people have done ED simulations in the past, but have 
generally assumed that the processes outside the ED have little direct impact 
on its overall operation. Jun et al. [11] present an extensive survey of 
simulation applications in health care. In fact, several simulation studies 
have been conducted to specifically analyze the issue of overcrowding in the 
ED. A priority queuing model was developed in one study to evaluate the 
potential impact of adding a fast-track facility to an emergency department 
[12]. Simulation modeling has also been employed to examine the 
relationship between hospital bed capacity and emergency admissions rates 
[13], with the finding that bed shortages can be expected when average bed 
occupancy rates exceed 85%. Simulations have been successfully applied to 
investigate the impact in the ED of nurse scheduling on utilization and 
patient length of stay (LOS) [14-16]. Based on these studies, 
recommendations were made for changing policies on staff scheduling, 
triage procedures and nursing responsibilities. Using the simulation model, 
the potential savings from the proposed changes were quantified. 

The study described earlier in this chapter [3] also modeled the flow of 
patients through an ED. For all of the ED simulations mentioned, the patient 
LOS in the ED is assumed to be an exogenous variable, sampled once for 
each patient from a statistical distribution based on historical data. This 
assumption is reasonable given the complexity associated with most 
emergency departments. One can usually construct and validate these 
models quite adequately. However, this method does not allow decision 
makers to investigate the impact of changing non-ED components on the 
overall process flow. For example, if the time required to complete an 
external consult was reduced, or the process for MRIs was improved, what 
impact would that have on wait times in the ED or throughout the entire 
hospital? 
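
The difference between the two modeling approaches can be made concrete with a minimal sketch. It assumes a list of historical lengths of stay and a set of process steps with fixed durations; all names and numbers are hypothetical and greatly simplified relative to a real ED model.

```python
import random


def exogenous_los(historical_los, rng):
    """LOS drawn once per patient from the historical (empirical) distribution."""
    return rng.choice(historical_los)


def process_los(step_minutes):
    """LOS built up from the individual process steps the patient goes through."""
    return sum(step_minutes.values())


# Hypothetical historical ED lengths of stay (minutes).
historical_los = [95, 120, 150, 180, 240, 310]

# Hypothetical pathway for one patient, including steps outside the ED's control.
baseline_steps = {"triage": 10, "assessment": 25, "external_consult": 90, "treatment": 40}
faster_consult = dict(baseline_steps, external_consult=45)

rng = random.Random(0)
print("exogenous LOS sample:", exogenous_los(historical_los, rng))
print("process LOS, baseline:", process_los(baseline_steps))
print("process LOS, faster consult:", process_los(faster_consult))
```

Only the process-based version responds when the consult time changes; the exogenous sample is drawn from the same historical values regardless of what happens outside the ED.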

In fact, our analysis suggested that the ED is a very complex entity, referred 
to by some of the doctors on our study team as “organized chaos”. In 2002, a 
team including operations researchers, ED physicians, a statistician and an 
epidemiologist received funding for a two-year study to analyze the detailed 
processes in ten Ontario hospital emergency departments. The Causes and 
Relationships of Overcrowding and Waiting in Different Emergency 
Departments (CROWDED) study was designed to include detailed data to 
promote better allocation decisions for scarce resources such as doctors, 
nurses, and examination rooms. The hospitals were selected to represent a 
cross-section of geography and clientele. Three large teaching hospitals, four 
community hospitals, and three rural emergency departments were selected 
for inclusion in the study. Two full-time research assistants were hired for 
one year to collect data by directly observing patients, doctors and nurses. 
We conducted three trips to each site. There was a pre-visit of 2-3 days to 
study the layout, understand the policies, meet people, and put up posters to 
educate and inform people about the study. Data collection was conducted in 
two separate one-week periods at different times of year to get a sense of 
pattern changes over time. The project was designed to construct a generic 
model of an ED that can provide detailed decision support for a wide range 
of process flow issues. 

8.5.2 Challenges encountered 

Doctors are difficult to track As mentioned earlier in the CHEO study, it is 
often difficult to tie physician workload to a specific patient. Doctors consult 
on the phone, read x-rays, view images on-line, chat with nurses and 
residents, as well as performing many other activities; all of these activities 
are done in the course of a patient’s treatment, but rarely happen when the 
physician is proximate to the patient. However, since doctors are probably 
the scarcest ED resource, it is important to determine accurate workload 
information for them. In the CHEO study, we chose to implement a work 
sampling method supplemented with a job shadow provided by a small 
group of nurses. In the CROWDED project, we had significantly more 
resources at our disposal, and we were determined to get very accurate 
workload information. 

Many physician and patient processes could not be observed directly. The 
observers needed to use indirect means of observation, such as consulting 
the patient chart or the “white board” that keeps track of patient progress in 
the ED. In some study sites, we had access to the hospital’s electronic order 
entry/patient tracking systems. This also helped the observers to track the 
patients’ pathways. However, in both paper and electronic documentation, it 
was found that recorded times did not usually reflect the actual time or 
duration of a process. For example, nurses or ward clerks might log an order 
for blood work into the computer at a certain time, but might not collect the 
blood until much later. The time recorded on the chart frequently 
corresponds to the time the order was entered; there is no information about 
the actual start and end time of the process. 

Missing data A related issue we discovered during the course of the project 
was that it was quite difficult to collect complete, accurate flow data on all 
ED patients. The observers estimated that some data was missing for 
approximately 10-15% of patients in the study. 
For example, critically ill patients may stay in the ED for a long time. The 
CROWDED study did not employ 24-hour observation, so process flow data 
tended to be incomplete. The observers tried, wherever possible, to fill in 
blanks using the patient’s chart, but it was difficult to get good time 
estimates. When patients remain in the ED outside of the period of direct 
observation, the patient pathway through the ED will always have some 
missing data. However, even if some patient data was missing, the observers 
were usually able to record a minimum data set including admission or discharge 
time along with any other charted information. Patients remaining in the ED 
for longer than a single observation shift tended to be admitted patients, or 
patients that required lengthy observation. 

Trauma cases were also difficult to track. Because treatment for trauma 
cases needs to be started immediately, charting is usually performed after the 
fact. Moreover, trauma cases are generally handled behind closed doors. On 
the assumption that it is inappropriate for data collectors to be inside a 
trauma room or that observation may impede patient care, it was decided to 
forego direct observation of trauma cases. Instead, the time points 
“trauma begins” and “trauma ends” were used as a way to track the many 
processes that could not be directly observed. 

Similarly, acute patients may also receive treatment or undergo tests 
according to medical directives behind closed doors/curtains. In these cases, 
many different processes may be happening. The observers used the charts 
after the fact to determine which processes had occurred. This usually 
provided reasonable results in terms of what happened, but not always when 
it was done. It was sometimes possible to estimate start and/or end times if, 
for example, the observers saw nursing staff gathering up supplies or 
equipment prior to a process, but there was a lot of guesswork. 

In addition to “closed door” treatments carried out by staff on trauma and 
acute patients, another challenge was that processes for many patients 
happen simultaneously. The research assistants were only able to observe the 
processes of one patient at any given time. Sometimes, for an acute 
patient such as a heart attack victim, a team of nurses and 
doctors might perform a series of treatments until the patient is stabilized. 
Capturing all these processes required observation of that particular patient for 
an extended period of time. During that time, other things could be 
happening to other patients which were not observed or recorded. 

Layout issues The layout of the ED sometimes created problems for data 
collection. Some EDs were physically spread out, which made it difficult to 
see what was happening to a patient or to observe the doctor/nurse treating 
the patient. In one ED in our study, the physical layout was divided into a 
“major” and a “minor” side. During peak times, doctors would be assigned 
exclusively to one side or the other. However, in off-peak times, one doctor 
would float between both. It became impossible to follow the doctor, the 
nurses and the patients simultaneously in this environment. At another site 
the ED had a number of separate areas. The segregation of the ED made 
tracking difficult for the observers. 

Fast-track clinic Some study sites had an off-site “fast-track” clinic (FTC) 
or an “urgent care” center, separate from the ED. Again, this physical 
separation made it difficult to track patients. 

At one site, the hospital had a fast-track clinic operating from 2 pm to 10 pm 
on weekdays. The FTC was in a separate area from the ED, but had four 
beds and was staffed by a Nurse Practitioner¹. During its hours of operation, 
less acute patients came to the ED, saw the triage nurse, were registered by 
ED staff, but then headed to the FTC for treatment. The fact that the FTC is 
external makes it harder to observe the patient flow process. It was tempting 
to simply ignore patients who were sent to an FTC; however, we believe it is 
important to model it as an internal process, using ED resources. In 
particular, one of our model decision variables may be the addition of 
two FTC physicians, or having some shared resources work in both the FTC and 
the regular ED. 

Wait time before triage When we consider the question of ED wait time, 
part of that measure involves patients waiting before triage or registration. 
Predictably, none of the study hospitals tracked or had data on “time before 
triage”. While we believe serious patients are seen immediately and all 
patients are triaged expeditiously, we asked our observers to sit in the 
waiting room and conduct a separate study of “time-to-triage” to determine 
the magnitude of this issue. Observing time-to-triage, however, meant that 
observers could not track patients inside the ED due to layout and sight-line 
issues. The results of our preliminary studies indicated that patients frequently 
line up to be triaged, but critically ill patients were not overly delayed. 

¹ A Nurse Practitioner is a Registered Nurse who has taken a graduate-level program 
and who can perform many of the functions that are commonly associated with 
doctors. 

Unplanned critical events In any study, blind luck (good or bad) sometimes 
comes into play. In the CROWDED study the data collection process was 
facilitated by a custom-designed PDA application. After the first few site 
visits were completed, the PDA programmer made some minor adjustments 
to the application. Subsequently, after three days of collection following the 
adjustment, the observers discovered that a bug in the revised program 
blocked the transfer of all patient demographic information (name, age, 
gender, ID number, etc.) to the production database. This was a serious issue 
in terms of validation and completion of missing data elements. 

Additionally, a number of unforeseeable public health issues arose during 
the collection process. The ED at one site was closed for several weeks 
because of an outbreak of the Norwalk virus, which interrupted our data 
collection. To make matters worse, after three days of data collection at a 
different site the next week, one of the observers became ill with 
Norwalk-like symptoms. She went into voluntary quarantine, and the other 
observer attempted to collect what she could for the remainder of the visit. 

However, the worst setback occurred in March 2003 when Toronto was hit 
with the SARS virus (Severe Acute Respiratory Syndrome). We had to pull 
the observers out of all study hospitals for almost two months. Even 
hospitals distant from Toronto were closed to non-essential personnel. 
Moreover, patient volumes in EDs throughout the province decreased in 
response to patient fear of SARS. Things started to return to normal after a 
short period, but we needed to extend the data collection for two months, 
hire an extra observer, and adopt an aggressive visit schedule to make up for 
lost time. 

Preemption When the ED gets busy, some processes can be preempted by 
more critical needs. When a doctor or nurse comes back to the interrupted 
process, they may have to start the entire process again. For example, while 
a nurse would not interrupt an IV start, he/she might interrupt an assessment. 
When the nurse later returns to complete the assessment, it is usually 
necessary to repeat some elements. One of our team members, an ED 
physician, believes the process can almost grind to a halt when things get 
busy. Physician assessments and nursing assessments are frequently 
interrupted. In our study the observers attempted to track starts and ends for 
all processes, even those that were incomplete, but this was an imperfect 
solution. 

Administrative issues Despite our best efforts, staff members at the 
hospitals were often suspicious about the intentions of our study. It was 
perceived as a study created by the provincial government to streamline the 
costs of health care and reduce employment. Many staff members at 
hospitals believed the study would never be used to benefit health care or 
that the study was misguided. Our research assistants were conscientious in 
assuring the participating hospital staff that we were performing an 
independent study funded by CIHR, and not by their hospital administration 
or the provincial government, and that we were doing our best to accurately 
represent the processes in their departments so that patient flow could be 
improved without compromising patient care. 

We also found that turnover in key management positions in the ED was a 
factor in our project. During our one-year study, the primary contact 
changed at over half of the hospitals. Despite the fact that all of the study 
hospitals had agreed to be part of the project, we often discovered when we 
went to visit a site that the current managers had no knowledge of our project, 
and we needed to begin the sales pitch again. In one case, we needed to 
reshuffle our data collection schedule because new managers did not know 
we were coming. 

Security Data security was a critical component of our study. During 
data collection, our team needed personal information to allow matches 
between paper and electronic hospital records. Data from the PDAs used for 
collection was downloaded daily to the laptop computers and backed up daily 
on a password-protected CD-ROM, which was kept in a safe, secure location. 
Upon return to the lab, the data was copied from the laptop onto a master 
CD-ROM, which was kept in a locked drawer. The data on the laptop was 
then stripped of all personal identifiers (name, address, ID numbers, etc.) to 
ensure patient anonymity. 

8.6 DISCUSSION/CONCLUSION 

Health care is an enormous business offering a wealth of potential 
applications for simulation and other operations research techniques [17]. 
However, health care is a business unlike any other business. In our 
experience the context in which a decision making situation arises has a 
significant impact on the way in which it is solved. Nowhere is this truer 
than in health care. We believe that, because analysts and clinicians speak 
different languages, operations research has made fewer inroads into this 
field than in more traditional industries. However, our experience also 
suggests that OR techniques can be successfully applied in the health care 
setting. The secret is to understand the unique nature of the health care 
business and its impact on models, decision makers, and the development of 
implementable policies. 

In this chapter we have used four simulation projects to highlight the 
practical lessons of applying operations research in health care. Analysts 
should remember that decision making in hospitals is characterized by 
multiple players; seeking the counsel and incorporating the objectives of all 
decision makers is vital in this environment. In this industry data collection 
systems may not be designed to provide administrative data; collecting data 
on patient flow and operational performance metrics may require some 
patience and may extend the project life cycle. Finally, while many 
processes and procedures are fundamentally similar regardless of the 
institution, there are usually enough local quirks to render multi-site 
“cookie-cutter” models infeasible. 

Health care is a fascinating industry to work in. The authors have, over the 
past decade, devoted themselves to applying operations research to health 
care and have enjoyed the experience immensely. It is our desire that the 
lessons we learned will prove useful for others following in this field. 

ABSTRACT 

Risks to public health arise from infectious disease, from exposure to toxic 
substances such as asbestos, from environmental insults, and from lifestyle 
risks such as smoking. The risk assessment that must precede healthcare 
interventions or legislation requires probabilistic, statistical and 
computational methodologies. The introduction to this chapter discusses 
how our perception of the risks to public health is changing, and identifies 
some trends in the methodologies used for risk analysis. Risk assessment is 
largely characterised by likelihood-based statistical inference, using point- 
process models of disease intensity as a function of position in space and 
time. Conditional likelihoods such as Cox’s partial likelihood and matched- 
pairs logistic regression are widely used to eliminate confounding variables. 
Two examples of the use of such conditional likelihoods are given. In the 
first, new tests for the space-time clustering of cases characteristic of 
infectious disease are derived and exemplified. In a second application of 
conditional likelihood, some research on risks of Shigella infection to 
schoolchildren arising at school or from playmates is presented. The original 
content of this chapter is two new tests of space-time clustering, and a case- 
study using an unusual conditional likelihood. 

KEYWORDS 

Epidemiology, Risk, Likelihood, Knox test, Shigella 

9.1 INTRODUCTION 

9.1.1 The major risks to public health today 

Nowadays we are seeking to identify ever smaller risks to public health. To 
this end, huge volumes of data are being made routinely available to the 
epidemiologist, and both statistical methodologies and computing power are 
being pushed to the limit. 

Public health analysts have traditionally been concerned with risks from 
infectious disease and from food poisoning. The familiar story of how in 
1854 John Snow removed the handle of the Broad Street pump in St. 
James’s parish, London, to prevent the spread of cholera exemplifies this. 
More recently, lifestyle risk factors such as smoking were identified for late- 
life diseases such as cancer and heart attacks, where the link between cause 
and effect was harder to establish. In the last 20 years, environmental public 
health has emerged as a major concern. Risks of exposure to toxic materials 
such as lead, asbestos and air pollutants have been much studied, and the 
resulting legislation has greatly ameliorated these hazards [1]. 
Environmental concerns also include the possible existence of disease 
clusters, either of infectious origin or around some ‘environmental insult’ 
such as a power station or toxic waste landfill site. 

In the developed world, there is now great anxiety about the risks posed by 
human activity in general, and from technology in particular. This includes 
genetically modified foods (in Europe), radioactivity from power stations, 
electromagnetic effects from power lines, pollution from toxic waste, global 
warming, and environmental pollution in all its forms. On the other hand, 
many people resolutely continue to smoke despite its clearly proven ill 
effects, concern about cancer from mobile phones has not reduced their use, 
and obesity even among the young is increasing. A large subculture abuses 
hard drugs, with resulting high mortality. Risks posed by the actions of 
others are evidently perceived as more threatening than risks posed by one’s 
own lifestyle [2]. 

Currently, the risk from ELF (extremely low frequency) magnetic fields, 
which has been studied with variable results for over 20 years [3, 4], has 
been firmly established. Only from 2000 onwards have large definitive 
studies and meta-analyses swung the weight of evidence firmly towards the 
existence of a real risk. ELF fields, which result from familiar technology 
widespread throughout the infrastructure of our cities and homes, are now 
known to greatly increase the risk of miscarriages [5, 6], and to increase the 
risk of childhood leukaemia [7]. They may also cause asthma and many 
other chronic illnesses [8]. ELF magnetic fields do not inspire the ‘fear and 
dread’ that radioactivity or genetically modified foods do, and are currently 
the concern mainly of local pressure groups opposing new power lines or 
cellular telephone masts. The situation here may change, and extensive 
litigation could result. 

The traditional concern of epidemiology, infectious disease, is also 
re-emerging as a major risk factor. Globalisation, travel and population 
movements can result in outbreaks of locally unfamiliar diseases. New 
diseases such as AIDS and variant Creutzfeldt-Jakob Disease have appeared. 
In the near future infectious disease may again take centre stage as 
a risk factor in the developed world, with the growth of drug-resistant 
strains of once familiar diseases such as multidrug-resistant tuberculosis [9], 
and with tropical diseases such as malaria increasing their range through 
global warming. 

Healthcare provision requires the estimation of risks to health from all these 
hazards. This chapter is concerned with the probabilistic and statistical 
techniques used. Risk estimation is not, however, the end of the story: it 
must be followed by remedial action. This may require simply giving 
reassurance to the public, issuing guidelines on lifestyle, taking action to 
remove particular environmental ‘insults’, or pressing for changes in the law 
or in public policy. Snow in fact had to argue his case with the Board of 
Guardians of St. James’s parish, and the pump handle was removed by them 
the following day, despite some doubts about the correctness of his case. The 
water board were then directed to improve the quality of the water. A 
modern analogue is the study led by Anto and Sunyer [10], in which asthma 
among residents of Barcelona was linked to soybean dust released when 
soybeans were unloaded to silos from ships in the harbour. This work led to 
the prompt installation of filters to prevent airborne dissemination of 
soybean dust in 1987 [10]. 

The first step in ameliorating health hazards is to demonstrate that they exist. 
The next section addresses this issue. 

9.1.2 Statistical inference and its problems 

Broadly, the aim of inference is to demonstrate an increased risk of disease 
or death arising from a particular risk factor, and then to quantify this risk. 
Several types of statistical analysis are relevant. 

The fast-growing methodology of disease mapping is used to reveal 
geographical variations in risk [11]. Much work has also been devoted to the 
study of disease clusters, either in space, perhaps around a power station, or 
as the space-time clusters characteristic of infectious disease. Cluster alarms 
are frequently reported to public health authorities, and must be investigated, 
although very often the cluster is the result of random variation, and no 
action need be taken [12, 13]. Such alarms demonstrate the strength of 
public anxiety about health risks posed by human activity. 

Ecological analysis measures explanatory variables and relates them to 
disease. Disease monitoring or surveillance seeks to detect outbreaks of 
disease very early in their progress [14] rather than studying outbreaks 
retrospectively. 

There are many problems with risk assessment. Some are endemic to the 
science or art of statistics itself. For example, we may wish to show a causal 
relationship but can only demonstrate an association. Epidemiologists have 
long wrestled with this problem and have developed stringent criteria for 
showing causality [15]. In practice, epidemiologists may not be able to 
satisfy these criteria. However, an epidemiological case for 
an association, based on several different studies, can eventually become so 
strong that rival explanations appear implausible to all but 
cranks or those with vested interests [1]. 

Other problems stem from the limited nature of the available data, such as 
confounding and bias in general. 

Confounding is best introduced by an example. If we wish to examine the 
association between drinking and cancer, smoking is a confounding variable. 
People who drink heavily also tend to smoke, so a naive analysis would 
show a strong association between drinking and cancer. Correcting for 
smoking, by examining the drinking-cancer association separately for 
smokers and nonsmokers, shows the effect of drinking to be small. 
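
The stratification just described can be illustrated with a small sketch. The counts are invented purely for illustration and are not data from any study; the function name risk_ratio is likewise our own. The example is constructed so that drinking has no effect within either smoking stratum, yet the pooled (naive) analysis shows a strong association.

```python
# Hypothetical counts, invented for illustration only:
# (cancer cases, persons) for heavy drinkers and for others, by smoking status.
strata = {
    "smokers":    {"drinkers": (90, 600), "non_drinkers": (30, 200)},
    "nonsmokers": {"drinkers": (4, 200),  "non_drinkers": (20, 1000)},
}


def risk_ratio(exposed, unexposed):
    """Ratio of the risk in the exposed group to the risk in the unexposed group."""
    (cases_exposed, n_exposed), (cases_unexposed, n_unexposed) = exposed, unexposed
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


def pooled(group):
    """Add up cases and persons for one exposure group across all strata."""
    return tuple(sum(s[group][i] for s in strata.values()) for i in range(2))


# Naive analysis: pool the strata, ignoring smoking.
crude_rr = risk_ratio(pooled("drinkers"), pooled("non_drinkers"))

# Stratified analysis: examine the association separately within each stratum.
stratum_rr = {
    name: risk_ratio(s["drinkers"], s["non_drinkers"]) for name, s in strata.items()
}

print(f"crude risk ratio: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"risk ratio among {name}: {rr:.2f}")
```

In this constructed example the apparent effect of drinking arises entirely because drinkers are more often smokers. Real data would rarely separate so cleanly.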

In general, confounding variables (sometimes called nuisance variables in 
the statistical literature) are either sources of risk in their own right, or they 
may augment or potentiate the effect of other variables in which we are 
interested. Such variables need to be included in any model of risk, but may 
be unobserved or completely unknown. Much of the technical content of this 
chapter deals with attempts to overcome this problem. 

Epidemiologists have long been aware of many different types of bias that 
‘can lead to conclusions that are systematically different from the truth’. Last 
[16] cites 27 different types of bias, of which confounding bias is one. They 
arise at all stages of a study from design and initial sample selection through 
interviews (recall bias), modelling and data analysis, to (finally) publication 
bias, where the picture is distorted by the nonpublication of negative or 
uninteresting results. 

The ecological fallacy [16] is a major source of bias. This arises when 
variables are measured over a region and the aggregated variables are used 
to draw conclusions at the individual level. For example, Durkheim [17] 
found that the suicide rate was greater in regions where a greater percentage 
of the population was Protestant. There is an obvious explanation, but the 
data could also be explained if Catholics were more likely to commit suicide 
in regions where they felt beleaguered. Related to this, fitting nonlinear 
models also requires the use of special statistical methods when using 
aggregate level data [18]. 

In general, our lack of knowledge about the biological basis of some risks, 
such as ELF magnetic fields, leads to erroneous estimates of individuals’ 
exposure to the hazard, and hence reduces our estimate of the 
risk posed, perhaps to the point where it does not attain statistical 
significance. For example, wiring type has been used as a risk marker but 
turns out not to be closely related to risk [4]. Our not knowing which subset 
of the population is at risk also reduces estimated risk, through the dilution 
of the susceptible population with nonsusceptibles. 

In addition to the many biases identified by epidemiologists, statisticians are 
becoming aware of a widespread tendency to understate the size of errors 
and confidence intervals. This bias appears empirically in meta-analyses of 
major trials. One cause is conditioning on the model finally selected. This 
means that one (rightly) chooses the model to best fit one’s data after what 
may be a long process of model fitting and iterative refinement, but then 
(wrongly) acts for purposes of statistical inference as if the model had been 
decided on without any reference to the data. Modern computing power 
augments this problem by making it feasible to fit many models. Naturally, 
the model finally selected fits the data ‘too well’. There is as yet no fully 
satisfactory solution to this difficulty. The problem is only partially 
alleviated by choosing models using such model-choice criteria as the 
corrected Akaike Information Criterion (AICc) [19]. Bayesian model- 
averaging is another, albeit computationally expensive, alternative [20]. 
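
As a minimal sketch of the model-choice criterion mentioned above, the following computes the corrected AIC using the standard small-sample formula AICc = AIC + 2k(k+1)/(n-k-1), where k is the number of fitted parameters and n the number of observations. The log-likelihood values and model names are hypothetical, chosen only to show the calculation. As noted in the text, such a criterion only partially alleviates the problem of conditioning on the selected model.

```python
def aicc(log_likelihood, k, n):
    """Corrected Akaike Information Criterion; requires n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Hypothetical fits: (maximised log-likelihood, number of parameters).
n_obs = 40
candidates = {"simple": (-120.0, 2), "complex": (-118.5, 6)}

scores = {name: aicc(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # smaller AICc is preferred

for name, score in scores.items():
    print(f"{name}: AICc = {score:.2f}")
print("preferred model:", best)
```

Here the more complex model fits slightly better but is penalised for its four extra parameters, so the simpler model is preferred.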

Experimentally, the possibilities available to epidemiologists are limited. 
There have been a few supervised healthcare interventions where a 
randomised group was encouraged to change lifestyle, but such interventions 
cost millions of dollars [21]. Prospective or cohort studies are relatively bias 
free, and are the ‘gold standard’, given that people cannot generally be 
randomised to adopt different lifestyles as in a clinical trial, and certainly not 
while the effect of doing so is in doubt. Prospective studies may however 
take many years to complete and accumulate only a few cases. They cannot 
include currently unknown risk factors. Retrospective or case-control studies 
can be carried out quickly, are more cost-effective and are widely used [21]. 

9.1.3 Definitions 

Some of the technical terms used in this chapter are now defined. 

Risk refers to the probability that an event such as illness or death will occur. 
Relative risk or risk ratio is the ratio of risk in exposed individuals to risk in 
unexposed individuals. Attributable risk is the risk that could be removed if 
the exposure to the risk factor were eliminated. Last [16] gives full 
definitions of these concepts. 
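
In symbols, these quantities may be written as follows. The notation is ours: R_1 denotes the risk in exposed individuals and R_0 the risk in unexposed individuals. Attributable risk is taken here as the risk difference among the exposed, which is one common reading of the definition; Last [16] should be consulted for the variants.

```latex
\[
\mathrm{RR} = \frac{R_1}{R_0},
\qquad
\mathrm{AR} = R_1 - R_0 .
\]
% Illustrative numbers only: R_1 = 0.03 and R_0 = 0.01 give RR = 3 and AR = 0.02.
```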

The hazard of an event (such as death) has a precise meaning in statistics. 
For a hazard h(t), h(t)dt is the probability that the event will occur in the time 
interval (t, t+dt), given that (conditioned on the fact that) it has not yet 
occurred by time t. When more than one event can occur, the intensity p(t) is 
defined such that p(t)dt is again the probability that the event will occur in 
the time interval (t, t+dt), but now conditioned on the previous history of 
such events. Intensity generalises the concept of hazard to repeatable events. 
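
The verbal definitions above can be written compactly. In this sketch T (the time at which the event occurs) and H_t (the history of events up to time t) are our own notation, and the relation between the hazard and the probability of remaining event-free is the standard one for a continuous event time.

```latex
\[
h(t)\,dt = \Pr\{\, t < T \le t + dt \mid T > t \,\},
\qquad
\Pr\{T > t\} = \exp\!\left(-\int_0^t h(u)\,du\right),
\]
\[
p(t)\,dt = \Pr\{\,\text{an event occurs in } (t,\, t + dt) \mid \mathcal{H}_t \,\}.
\]
```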

The likelihood function is the probability or probability density function 
(pdf) of observing the data given the model. Statistical inference is often 
likelihood-based. In particular, there is a class of powerful likelihood-based 
tests called score tests [22] where the test statistic is the derivative of the 
logarithm of the likelihood function with respect to a model parameter of 
interest, evaluated in the limit of ‘no effect’ when the parameter value is 
zero. 
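
For a single parameter of interest, written here as θ (our notation), the score statistic can be sketched as below. The standardisation by the Fisher information I(θ) and the chi-squared reference distribution are the usual textbook form, added here for illustration; the text above states only the definition of the statistic itself.

```latex
\[
U(\theta) = \frac{\partial \log L(\theta)}{\partial \theta},
\qquad
I(\theta) = -\,\mathrm{E}\!\left[\frac{\partial^2 \log L(\theta)}{\partial \theta^2}\right],
\]
\[
\frac{U(0)^2}{I(0)} \;\overset{\text{approx.}}{\sim}\; \chi^2_1
\quad \text{under the hypothesis of no effect, } \theta = 0 .
\]
```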

9.1.4 Current situation and trends in risk analysis 

The methods of risk assessment currently used are mainly parametric models 
of the hazard or of the probability of disease or morbidity, fitted to data 
using likelihood-based methods. The widespread use of likelihood functions 
based on point-process models unifies this field methodologically. 

While some likelihood functions are comparatively simple, such as those 
used in logistic regression, others are more complicated. These latter 
likelihood functions are derived from the theory of counting processes [23], 
and enable models of the intensity of disease as a function of spatial and 
temporal location of susceptible individuals and of ‘environmental insults’ to 
be fitted to data [24]. Thus the stochastic theory of counting processes is the 
probabilistic underpinning of modern risk assessment, and likelihood-based 
methods of inference are its statistical methodology. The ‘executive arm’ is 
the great availability of data and of computing power, with aids such as 
geographical information systems (GISs) replacing the older use of maps. 
Satellites such as Landsat provide detailed information that can be organized 
by a GIS to facilitate epidemiological studies. 

In statistics generally, there has been for some years a debate between 
Bayesians and frequentists. Frequentists regard the probability of an event as 
a statement about frequencies in many hypothetical trials, while Bayesians 
take a subjective view of probability. This enables Bayesians to write down 
‘prior probabilities’ of events that reflect their beliefs before the data are 
examined, and to use Bayes’ theorem (which statisticians of all stripes 
accept) to construct a statement of the ‘posterior’ probability of the event 
given the evidence. 
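
The step from prior to posterior probability can be sketched in a few lines. The numbers are hypothetical (a prior probability of 1% that an individual has a disease, and a test that is positive in 95% of those with the disease and 5% of those without), and the function name is our own.

```python
def posterior_probability(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a hypothesis given the evidence, by Bayes' theorem."""
    evidence = prior * p_evidence_if_true + (1 - prior) * p_evidence_if_false
    return prior * p_evidence_if_true / evidence


# Hypothetical inputs, for illustration only.
post = posterior_probability(prior=0.01, p_evidence_if_true=0.95, p_evidence_if_false=0.05)
print(f"posterior probability: {post:.3f}")
```

With these inputs the posterior probability is about 0.16: the evidence raises the probability sixteenfold, but the low prior still dominates.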

The Bayesian/frequentist debate has rumbled on for years. The 
computationally more expensive Bayesian methods have gained ground in 
the last decade, and some Bayesians have claimed that their approach 
constitutes a new scientific paradigm in the Kuhnian sense. Recently, there 
has been some evidence that many statisticians are using an eclectic mixture of 
Bayesian and frequentist methods, in pragmatic attempts to find the best 
solutions to particular problems. Bayesian concepts such as prior probability 
and frequentist concepts such as confidence intervals may be mixed in the 
same article, and this is not now considered such a solecism as it was a few 
years ago. It is becoming ‘horses for courses’ as practitioners seek answers 
to practical problems, and leave the philosophy to take care of itself. 

This trend can also be seen in epidemiology. Bayesian methods based on 
Markov chain Monte Carlo are now commonly used in disease 
mapping, where all available information must be synthesised, while 
frequentist methods predominate where it is necessary to present evidence for 
a hypothesis for public debate, independent of prior belief. 

9.1.5 Conditional likelihood 

Likelihood-based methods of inference are the most powerful, so tweaking 
the ‘plain vanilla’ or unconditional likelihood function in some way is an 
attractive option. 

Many ingenious likelihood-based statistical methods have been developed to 
eliminate confounding variables, such as the use of Cox’s partial likelihood 
[25] and the use of matched pairs in logistic regression [26, 27]. Both of 
these methods rely on conditioning the likelihood function in order to 
remove confounding variables. Ideally, we need sacrifice only a little of 
the information in a dataset to get rid of confounding bias. 
Thus in the matched-pairs case, we model the probability that one particular 
individual was a case, and N others were controls, given that (conditioning 
on the event that) one of the N + 1 individuals was a case. This approach enables 
risks due to known risk factors such as exposure to asbestos to be computed, 
while confounding variables are absent because of the matching of cases to 
similar controls [26, 27]. 
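
For a single matched set this conditional likelihood takes a simple form under a logistic model. In the sketch below the notation is ours: the case is indexed 0 and its N controls 1, ..., N, x_j is the vector of risk-factor values for individual j, and β is the vector of log odds ratios. Any term common to the whole matched set, such as a set-specific intercept representing the matching variables, cancels between numerator and denominator.

```latex
\[
L_c(\beta) = \frac{\exp(\beta^{\top} x_0)}{\sum_{j=0}^{N} \exp(\beta^{\top} x_j)} .
\]
```

The full conditional likelihood is the product of such terms over all matched sets.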

In Cox’s partial likelihood approach [25], the hazard that an event such as 
death occurs to an individual is conditioned on death occurring to one of the 
individuals in the ‘risk set’. Applying this conditional likelihood approach to 
the proportional hazards model, in which hazards from different sources 
multiply rather than adding, the ‘baseline hazard’ due to common 
confounding variables cancels out, leaving the dependence on risk factors 
alone in the likelihood function. Often in prolonged cohort studies, the 
baseline hazard of an outcome is expected to vary with time in an unknown 
way, and so the technique of Cox regression or partial likelihood is 
appropriate. 
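
A minimal sketch of the partial log-likelihood may make the cancellation concrete. It assumes a single covariate, a hazard of the form h_0(t) exp(beta * x), and no tied event times; the function name and the data are hypothetical. Note that the baseline hazard h_0(t) does not appear anywhere in the computation.

```python
import math


def cox_partial_log_likelihood(beta, times, events, x):
    """Log partial likelihood for one covariate, assuming no tied event times.

    times  : observation time for each individual
    events : 1 if the event was observed at that time, 0 if censored
    x      : covariate value for each individual
    """
    total = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored individuals contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        log_denominator = math.log(sum(math.exp(beta * x[j]) for j in risk_set))
        total += beta * x[i] - log_denominator
    return total


# Hypothetical data, for illustration only.
times = [2.0, 3.0, 5.0, 7.0]
events = [1, 0, 1, 1]
x = [1, 0, 1, 0]

ll0 = cox_partial_log_likelihood(0.0, times, events, x)
ll1 = cox_partial_log_likelihood(1.0, times, events, x)
print(f"log partial likelihood at beta = 0: {ll0:.4f}")
print(f"log partial likelihood at beta = 1: {ll1:.4f}")
```

In practice the function would be maximised over beta by a numerical optimiser; that step is omitted here.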

This chapter illustrates the state of the art of conditional likelihood methods, 
with two such likelihood-based approaches, but using less familiar 
conditional likelihoods. 

The first study (Section 9.2) arose from an attempt to derive the well-known 
Knox test of space-time clustering [28, 29] as a score test. Here the 
hazard of infection is modelled as being elevated if close in space and time 
to a ‘case’ of the disease. Point-process models lead to a score test of 
infectious aetiology, and to estimates of the relative risk, when the 
population density S(x, t) is known. When S is not known, but must be 
imputed from the locations and times of the cases, we are asking a lot of the 
data. By making reasonable assumptions about the dependence of S on space 
and time (that it factorises) it is possible to derive score tests based solely on 
the locations and times of infections. The Knox test can indeed be derived as 
a score test, and so can a ‘corrected’ version of the Knox test, which it is 
hoped may be more powerful than the Knox test itself. 
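
For orientation, the conventional form of the Knox test can be sketched as follows. This is the familiar permutation version, not the score-test derivation developed in Section 9.2. The statistic counts pairs of cases that are close in both space and time, and its significance is assessed by shuffling the case times over the case locations. The thresholds, data and function names are hypothetical, and math.dist requires Python 3.8 or later.

```python
import math
import random


def knox_statistic(points, times, max_dist, max_lag):
    """Count pairs of cases that are close in both space and time."""
    n = len(points)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            close_in_space = math.dist(points[i], points[j]) <= max_dist
            close_in_time = abs(times[i] - times[j]) <= max_lag
            if close_in_space and close_in_time:
                count += 1
    return count


def knox_permutation_test(points, times, max_dist, max_lag, n_perm=999, seed=0):
    """Return the observed statistic and a Monte Carlo p-value."""
    rng = random.Random(seed)
    observed = knox_statistic(points, times, max_dist, max_lag)
    shuffled = list(times)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between where and when
        if knox_statistic(points, shuffled, max_dist, max_lag) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Hypothetical cases: two small groups, each tight in both space and time.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
times = [0, 1, 50, 51]
observed, p_value = knox_permutation_test(points, times, max_dist=2, max_lag=2)
print("close pairs:", observed, "p-value:", p_value)
```

The critical distances in space and time must be chosen in advance, which is the difficulty of defining proximity that Section 9.2 takes up.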

In the second study (Section 9.3), which was motivated by an outbreak of 
dysentery in the North West of the UK, the risk of contracting dysentery 
(Shigella sonnei) from school toilets is investigated using a conditional 
likelihood related to the ‘weird bootstrap’ [23]. Data on infected individuals 
only are used. Intuitively, each individual acts as his or her own control; they 
will be infected at moments when the risk is high, and their younger selves 
who were uninfected at moments of lower risk play the part of controls. 
This study is unusual in that rather than asking if there is an infectious 
aetiology, we ask if a particular mode of exposure to an infectious disease 
constitutes a risk factor. This study also models hazards of infection from 
independent sources as adding (as they should) rather than multiplying as 
they do in Cox’s proportional hazards model. By fitting the model to data, we 
can estimate the relative risks of infection corresponding to the various risk 
factors, and also the attributable fractions of infection. 

In this analysis, the device of blocking (matching) was also used. The 
transmission coefficient was assumed to be identical for children using the 
same toilet block, and to vary between toilet blocks. Children using the same 
block are demographically similar. Blocking makes it unnecessary to model 
the way in which the hazard of infection with S. sonnei depends on 
demographic variables. 

9.1.6 Weird likelihoods 

In general, infection is both a result of earlier infections (an effect) and also 
a cause of later infections. Some of the conditional likelihoods used in 
inference are derived by loosening this relationship, and imagining that 
cause and effect can be decoupled. We consider some other pattern of 
infection than the observed one, e.g. that those who were infected might 
have been infected at different epochs (Shigella study), or that infections 
might have occurred at any permutation of the observed space-time 
coordinates (space-time clustering study). We then condition the observed 
likelihood on the more general pattern of infection. However, in constructing 
this more general pattern of infection, the infections are only regarded as 
effects, and not as the originators of new infections. Since Anderson et al. 
[23] have referred to the Monte Carlo generation of random events from the 
general pattern of infection as the ‘weird bootstrap’, in this chapter the 
corresponding likelihoods, for want of a better term, are referred to as ‘weird 
likelihoods’. 



9.2 EXAMPLES OF METHODOLOGIES: IDENTIFYING SPACE- 
TIME CLUSTERS 

Using conditional likelihoods derived from point-process models, methods 
are developed for testing whether cases have an infectious aetiology, and for 
estimating the relative risk arising from proximity to an infecter in space and 
time. 

A problem here is the definition of ‘proximity’. If we do not even yet know 
whether or not a disease has an infectious aetiology, we are unlikely to know
the ‘critical distances’ defining spatial and temporal proximity. Use of the
likelihood framework shows the way to a solution.

The likelihood function for n events arising from a point process occurring
in space and time may be written as

$$L = \prod_{i=1}^{n} \{\rho(x_i, t_i)\, S(x_i, t_i)\} \exp\left(-\int \rho(x,t)\, S(x,t)\, dx\, dt\right), \qquad (1)$$

where $\rho$ is the intensity of the point process, $S$ the population density,
$x_i$ and $t_i$ are the space and time coordinates of the $i$th infection, and the
integral runs over a region of space (usually a surface) and time.

This likelihood function for a point process can be derived by writing the
likelihood as a product of conditional probabilities for each small time step,
where if an infection occurs, the conditional probability is $\rho(u)\,du$, and if no
infection occurs, the probability is $1 - \rho(u)\,du$. Equation (1) follows by taking
the exponential term as the product-limit. This expression is well known [25,
26, 30].

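A simple model of a point process with infection is to model $\rho$ as

$$\rho(x,t) = \beta \left\{ 1 + \alpha \sum_{j=1}^{n} f(x - x_j)\, g(t - t_j) \right\}, \qquad (2)$$

where $\beta$ is an unknown rate constant, and infection is increased by a factor
$1+\alpha$ near an infected individual, i.e., the relative risk from proximity to an
infecter is $1+\alpha$.

The definitions of $f$ and $g$ are:

$$f_{ij} = f(x_i - x_j) = \begin{cases} 1 & \text{if } |x_i - x_j| < d_x \\ 0 & \text{otherwise} \end{cases}$$

$$g_{ij} = g(t_i - t_j) = \begin{cases} 1 & \text{if } |t_i - t_j| < d_t \\ 0 & \text{otherwise} \end{cases}$$

where $d_x$ and $d_t$ are space and time critical distances. We also have
$f_{ii} = g_{ii} = 0$, because a case cannot cause itself. Other definitions of $f$ and $g$
can be made, and much of the methodology will still carry through.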
The rate constant $\beta$ is unknown and is a nuisance parameter. It may be
removed in three ways: as a marginal likelihood obtained by integrating $\beta$
over its prior distribution with pdf $1/\beta$, by estimating $\beta$ and plugging the
estimate back in to obtain a profile likelihood, or by conditioning the
likelihood on the ‘weird’ likelihood

$$L_n = \left(\int \rho(x,t)\, S(x,t)\, dx\, dt\right)^{n} \exp\left(-\int \rho(x,t)\, S(x,t)\, dx\, dt\right) \Big/ n! \qquad (3)$$

that n individuals are infected. In any case, to a constant factor, we have the
conditional likelihood

$$L_c = \frac{\prod_{i=1}^{n} \left\{ \left(1 + \alpha \sum_{j=1}^{n} f_{ij} g_{ij}\right) S(x_i, t_i) \right\}}{\left(N + \alpha \sum_{k=1}^{n} N_k\right)^{n}}, \qquad (4)$$

where $N = \int S(x,t)\, dx\, dt$ and $N_k = \int S(x,t)\, f(x - x_k)\, g(t - t_k)\, dx\, dt$.

Assuming that the population density $S(x,t)$ is known, equation (4) can be
used to estimate the relative risk $1+\alpha$ (e.g., as a maximum-likelihood
estimate) and to derive confidence limits on $\alpha$. The starting point is the
log-likelihood from equation (4),

$$\ell = \sum_{i=1}^{n} \log\left(1 + \alpha \sum_{j=1}^{n} f_{ij} g_{ij}\right) - n \log\left(1 + \alpha \sum_{k=1}^{n} N_k / N\right). \qquad (5)$$



Maximizing this with respect to $\alpha$ gives an estimate of the relative risk $1+\alpha$.
A large-sample 95% confidence interval can be derived as the range of $\alpha$ for
which $\ell$ exceeds $\ell_{\max} - (1.96)^2/2$. Equation (5) can also be used to derive
a score test of $H_0$ that $\alpha = 0$. The score statistic is

$$s = \frac{d\ell}{d\alpha}\Big|_{\alpha=0} = \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij} g_{ij} - n \sum_{k=1}^{n} N_k / N,$$

which is the number of close pairs of cases relative to the number expected if
infections occur randomly. The variance of the score is estimated as

$$\mathrm{Var}(s) = -\frac{d^2\ell}{d\alpha^2}\Big|_{\alpha=0} = \sum_{i=1}^{n} \left(\sum_{j=1}^{n} f_{ij} g_{ij}\right)^{2} - n \left(\sum_{k=1}^{n} N_k / N\right)^{2}. \qquad (6)$$
Hence a standardised score $z = (d\ell/d\alpha|_{\alpha=0}) / \{-d^2\ell/d\alpha^2|_{\alpha=0}\}^{1/2}$ can be
calculated, which is (for very large samples) normally distributed. Its
reference distribution is better found by Monte Carlo simulation, i.e. by
simulating random coordinates and times of infections a large number N
(e.g., 10,000) times from the trivariate pdf $S(x,t) / \int S(x,t)\, dx\, dt$ and recalculating
values Z of the standardised score. The p-value of the test is read off as the
proportion of simulated Z values that exceed z.
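The quantities in equations (5) and (6) are simple to compute once the per-case close counts are known. The following is a minimal sketch, not the author's program: the function names are hypothetical, `close_counts[i]` stands for the sum over j of f_ij g_ij, `b` stands for the sum of N_k/N, and the maximiser assumes the log-likelihood has a single maximum on the search interval.

```python
import math

def log_lik(alpha, close_counts, b):
    """Log-likelihood of equation (5).

    close_counts[i] is sum_j f_ij * g_ij for case i; b is sum_k N_k / N.
    """
    n = len(close_counts)
    data_term = sum(math.log(1.0 + alpha * a) for a in close_counts)
    return data_term - n * math.log(1.0 + alpha * b)

def score_test(close_counts, b):
    """Return the score s, its variance as in equation (6), and z = s / sqrt(var)."""
    n = len(close_counts)
    s = sum(close_counts) - n * b
    var = sum(a * a for a in close_counts) - n * b * b
    return s, var, s / math.sqrt(var)

def mle_alpha(close_counts, b, upper=50.0, tol=1e-8):
    """Golden-section search for the maximiser of equation (5) on [0, upper].

    Assumption: the log-likelihood is unimodal on the interval.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, upper
    while hi - lo > tol:
        x1 = hi - inv_phi * (hi - lo)
        x2 = lo + inv_phi * (hi - lo)
        if log_lik(x1, close_counts, b) < log_lik(x2, close_counts, b):
            lo = x1  # the maximum lies to the right of x1
        else:
            hi = x2  # the maximum lies to the left of x2
    return 0.5 * (lo + hi)
```

A likelihood-based 95% interval could then be read off by scanning `log_lik` for the values of alpha within 1.96²/2 of its maximum.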

The space and time critical parameters $d_x$ and $d_t$ are usually unknown. The
test proposed here and the Knox test therefore have the unusual property of
having nuisance parameters present only under $H_1$. The asymptotic theory of
such tests is given in [31, 32]. Here however exact p-values are found using
the Monte Carlo approach. The following argument derives the form of the
test statistic for use with unknown critical parameters that gives the most
powerful test.

By the Neyman-Pearson Lemma (e.g., [33]), particularising to the situation
described here, asymptotically most powerful tests must be based on the
difference of log-likelihoods

$$\ell(\hat\alpha, \hat d_x, \hat d_t) - \ell_0, \qquad (7)$$

where $\hat\alpha$ is the maximum-likelihood estimate (MLE) of $\alpha$, and $\hat d_x, \hat d_t$
denote the MLE of the nuisance parameters $d_x, d_t$. The log-likelihood $\ell_0$
under $H_0$ when $\alpha = 0$ does not depend on $d_x, d_t$.

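From, e.g., [33], the asymptotic approximation of expression (7) is:

$$\ell(\alpha, d_x, d_t) - \ell_0 \simeq s\alpha - (1/2)\,\mathrm{Var}(s)\,\alpha^2. \qquad (8)$$

Maximizing this expression for $\alpha$ and $d_x, d_t$ yields

$$\ell(\hat\alpha, \hat d_x, \hat d_t) - \ell_0 = (1/2) \sup_{d_x, d_t} z^2, \qquad (9)$$

where

$$z = s / (\mathrm{Var}\, s)^{1/2}. \qquad (10)$$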
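For fixed critical distances, the maximisation over $\alpha$ that links expression (8) to equation (9) can be written out as follows; this is a routine calculus step added for clarity rather than a quotation from [33].

$$\frac{\partial}{\partial\alpha}\left\{ s\alpha - \tfrac{1}{2}\mathrm{Var}(s)\,\alpha^2 \right\} = s - \mathrm{Var}(s)\,\alpha = 0 \;\Longrightarrow\; \hat\alpha = \frac{s}{\mathrm{Var}(s)},$$

$$s\hat\alpha - \tfrac{1}{2}\mathrm{Var}(s)\,\hat\alpha^2 = \frac{s^2}{2\,\mathrm{Var}(s)} = \tfrac{1}{2} z^2 .$$

Taking the supremum over $d_x$ and $d_t$ then gives the right-hand side of equation (9).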



As we do not wish to reject $H_0$ if sup z < 0 (this is a 1-tailed test), sup $z^2$ is
replaced by sup z. Tests based on the statistic sup z are therefore, for large
samples, the most powerful possible. In practice, sup z is computed both for
the sample, and for N simulations, and the p-value found as the proportion of
simulations for which sup Z > sup z. We now have a test for space-time
clustering when the population density is known as a function of space and
time, but the critical distances are unknown.

The population density S(x,t) may however not be known. The Knox test 
[28, 29] has been widely used in this situation. Here the test statistic is 
simply the number of close pairs. Its reference distribution is best found by 
permuting the labels of either space or time, thus making the Knox test into a 
permutation test. Such permutation is justified if the population density S(x,t) 
= A(x)B(t). Here the population grows or decays uniformly throughout the 
region. The Knox test is known to give spurious results if this assumption is 
not met, for example if a population migration takes place. Kulldorff [34] 
suggests that the Knox test be tried, and only if it gives a significant result is 
there the need to acquire population data and to carry out any more 
sophisticated test. 
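The permutation form of the Knox test described above can be sketched in a few lines. This is an illustrative implementation only: the function names are hypothetical, the closeness thresholds are taken as inclusive (an assumption, so that a time distance of zero matches cases with equal onset times), and the p-value is the proportion of time-label permutations whose count is at least the observed count.

```python
import math
import random

def close_pairs(points, times, d_x, d_t):
    """Knox statistic: number of unordered pairs close in both space and time."""
    n = len(points)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            near_space = math.dist(points[i], points[j]) <= d_x
            near_time = abs(times[i] - times[j]) <= d_t
            if near_space and near_time:
                count += 1
    return count

def knox_test(points, times, d_x, d_t, n_perm=2000, seed=0):
    """Return the observed count and a Monte Carlo permutation p-value."""
    rng = random.Random(seed)
    observed = close_pairs(points, times, d_x, d_t)
    shuffled = list(times)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the time labels, keep locations fixed
        if close_pairs(points, shuffled, d_x, d_t) >= observed:
            hits += 1
    return observed, hits / n_perm
```

Permuting the time labels while holding the locations fixed is only justified under the factorisation S(x,t) = A(x)B(t) stated in the text.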

The Knox test can be derived as a score test using a ‘weird’ likelihood,
under the assumption that the population density factorises. Restricting the
‘weird’ likelihood in equation (3) to cases whose space or time coordinates
are permutations of the observed cases, and conditioning the likelihood in
equation (1) on it, the likelihood becomes

$$L_w = \frac{\prod_{i=1}^{n} \left(1 + \alpha \sum_{j=1}^{n} f_{ij} g_{ij}\right)}{\sum_{\text{perms}} \text{numerator}},$$

where the terms in S have cancelled out.

It is easy to see that the Knox test follows as the score test based on the
statistic $d\ell_w/d\alpha|_{\alpha=0}$ from this likelihood function, where $\ell_w = \log L_w$. This
formulation of the test enables the relative risk $1+\alpha$ to be estimated as the
MLE of $1+\alpha$, and for a confidence interval on $\alpha$ to be estimated. For
computational purposes, the denominator would be replaced by a large
number of randomly chosen permutations.

Besides the large-sample confidence interval based on the Normal limit of
the likelihood function, an exact confidence interval for $\alpha$ can be computed
by exploiting the relationship between a statistical test and a confidence
interval, that the confidence interval is the set of values $\alpha_0$ of $\alpha$ for which the
hypothesis $\alpha = \alpha_0$ cannot be rejected. The score test can be done for any
value $\alpha_0$ and its p-value found as $p = \sum L_w$, the sum running over those
permutations whose score is at least s, where s is the observed score.

Another benefit of the formulation of the Knox test as a score test is that it
makes clear how covariates such as age and gender should be treated.
Suppose that there are m classes of individuals. The risk $\alpha$ now becomes a
‘who was infected by whom’ matrix, where $\alpha_{kl}$ is the risk that a class l
individual infected a class k individual. The population density S now also
sprouts a class suffix. The likelihood becomes




where c(i) denotes the class of the ith individual. 

It is interesting to see what happens if we do not condition on a ‘weird’
likelihood. On conditioning the likelihood in equation (4) on the event that
the cases are drawn from some permutation of space-time labels of the actual
cases, we obtain the conditional likelihood

$$\frac{L_c}{\sum_{\text{perms}} L_c}.$$

The corresponding score is now not quite the Knox statistic, as the
exponential term does not cancel. The expression $\sum_{k=1}^{n} N_k$ would only be
invariant under permutation if either A or B were a constant. The exponential
term gives the probability that no other cases were infected besides the n
cases observed, and under $H_1$ that $\alpha > 0$ this varies between permutations.
Essentially we have the score statistic from equation (6), to be evaluated by
permuting space or time labels. The second term is the expectation of the
first, and varies much less strongly with permutation.

Estimating S=AB as proportional to a product of sums of delta-functions

$$S(x,t) \propto \sum_{i=1}^{n} \delta(x - x_i) \sum_{j=1}^{n} \delta(t - t_j)$$

at the space and time coordinates of the observed cases gives a score statistic
which after a slight adjustment becomes the modified Knox statistic

$$T = \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij} g_{ij} - \frac{1}{n-1} \sum_{k=1}^{n} \sum_{l=1}^{n} \sum_{j=1}^{n} f_{kj} g_{lj}. \qquad (11)$$



The first term is the Knox statistic, twice the number of pairs of cases that
are close both in space and time. The second term is the expectation of the
first under $H_0$. This follows because $\sum_{k=1}^{n} f_{kj}$ is the number of cases close in
space to the jth case. A fraction $\sum_{l=1}^{n} g_{lj}/(n-1)$ of cases are close in time to
the jth case, so if space and time distributions of cases are independent, we
should expect the number of close cases to the jth to be
$\sum_{k=1}^{n} \sum_{l=1}^{n} f_{kj} g_{lj}/(n-1)$, and twice the total number of close cases to be
$\sum_{k=1}^{n} \sum_{l=1}^{n} \sum_{j=1}^{n} f_{kj} g_{lj}/(n-1)$.



The proposed test is thought likely to be more powerful than the unmodified 
Knox test, following an argument from Lehmann [35]. A test statistic such 
as the number of close pairs can only take integer values, whereas the 
statistic in equation (11) breaks this degeneracy and so can take many more 
values. Consider a set of simulated values of the test statistic. In moving 
from the Knox test to the proposed test, the fraction of simulated values 
greater than or equal to the sample value will decrease, making the p-value 
of the test smaller and the test more powerful, as all the previously lumped 
values now span a range. 
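To make the statistic in equation (11) concrete, the sketch below computes it from case locations and onset times. It is an illustration under stated assumptions rather than the author's FORTRAN95 program: the function name is hypothetical, closeness thresholds are taken as inclusive, and the triple sum is evaluated through the column sums of f and g.

```python
import math

def modified_knox(points, times, d_x, d_t):
    """Modified Knox statistic T of equation (11).

    f[i][j] and g[i][j] are 0/1 closeness indicators in space and time,
    with zero diagonals because a case cannot cause itself.
    """
    n = len(points)
    f = [[1 if i != j and math.dist(points[i], points[j]) <= d_x else 0
          for j in range(n)] for i in range(n)]
    g = [[1 if i != j and abs(times[i] - times[j]) <= d_t else 0
          for j in range(n)] for i in range(n)]
    # First term: ordered close pairs, i.e. twice the Knox count.
    knox_term = sum(f[i][j] * g[i][j] for i in range(n) for j in range(n))
    # Second term: sum_j (sum_k f_kj) * (sum_l g_lj) / (n - 1).
    space_counts = [sum(f[k][j] for k in range(n)) for j in range(n)]
    time_counts = [sum(g[l][j] for l in range(n)) for j in range(n)]
    expected = sum(a * b for a, b in zip(space_counts, time_counts)) / (n - 1)
    return knox_term - expected
```

Because the second term is generally not an integer, T can take many more distinct values than the raw count of close pairs, which is the property the power argument relies on.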

Baker [36] gave a version of the Knox test for use when time and space 
critical distances are unknown. The same procedure follows for the score 
statistic in equation (11). The test statistic sup z is evaluated by evaluating z
at a large number of grid points that cover all ‘reasonable’ values of the
space and time critical distances. Its reference distribution under $H_0$ is found
by treating a large number of permuted datasets identically, i.e. we find sup Z
for each permuted dataset. The variance of T is needed for this, and can be
found from the simulations, but it is more convenient to compute it using a 
formula. 
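The grid search over unknown critical distances can be sketched as below. This is a hedged illustration, not the published algorithm: all names are hypothetical, closeness thresholds are inclusive, and T is standardised at each grid point with the mean and standard deviation of the permuted values (the simulation route mentioned in the text) instead of the variance formula. Grid points where the permuted values do not vary are skipped, and the sketch assumes at least one grid point remains.

```python
import math
import random
import statistics

def knox_T(points, times, d_x, d_t):
    """Modified Knox statistic of equation (11), via column sums of f and g."""
    n = len(points)
    total, space, time = 0, [0] * n, [0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            f = math.dist(points[i], points[j]) <= d_x
            g = abs(times[i] - times[j]) <= d_t
            total += f and g
            space[j] += f
            time[j] += g
    return total - sum(a * b for a, b in zip(space, time)) / (n - 1)

def sup_z_test(points, times, space_grid, time_grid, n_perm=200, seed=0):
    """Return the observed sup z and its permutation p-value."""
    rng = random.Random(seed)
    grid = [(dx, dt) for dx in space_grid for dt in time_grid]
    shuffled, perms = list(times), []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        perms.append(list(shuffled))
    # perm_T[k][r]: statistic at grid point k for permuted dataset r.
    perm_T = [[knox_T(points, p, dx, dt) for p in perms] for dx, dt in grid]
    means = [statistics.fmean(row) for row in perm_T]
    sds = [statistics.pstdev(row) for row in perm_T]

    def sup_z(values):
        return max((v - m) / sd for v, m, sd in zip(values, means, sds) if sd > 0)

    observed = sup_z([knox_T(points, times, dx, dt) for dx, dt in grid])
    sims = [sup_z([row[r] for row in perm_T]) for r in range(n_perm)]
    return observed, sum(1 for z in sims if z >= observed) / n_perm
```

Every permuted dataset is passed through the same grid and the same standardisation as the sample, which is what makes the resulting p-value valid despite the search over critical distances.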

The derivation of this follows the method for calculation of permutational 
variances of Knox-like statistics set forth very clearly by [37]. The 
calculations are straightforward but somewhat tedious, and so only the result 
is quoted here. Its correctness has been verified by simulation studies. 



With $r_i = \sum_{j=1}^{n} f_{ij}$, $r = \sum_{i=1}^{n} r_i$, $s_i = \sum_{j=1}^{n} g_{ij}$ and $s = \sum_{i=1}^{n} s_i$,
Var(s) = 



irs 

w(«-l) n(n-\)(n-2) 




(12) 



As an example of the use of this modified test, McHardy et al. [38] gave grid 
coordinates of homes and year of onset of 22 cases of Kaposi’s sarcoma, in 
the West Nile district of Uganda. The authors state that computer analysis 
using the Knox [28] and Barton and David [39] techniques showed no 
significant space-time clustering. Time criteria varied from less than 1 to less 
than 5 years, and space criteria up to 24 kilometers (km). There are two pairs 
of cases for which onset was in the same year, and who lived within 2 km of 
each other. 

A reanalysis using a Knox test with $d_x$ = 2 km, $d_t$ = 0 months gave
p = 0.042, with 50,000 simulations. The modified test described here gave
p = 0.0176. The distribution of the test statistic T from equation (11) is
shown in Figure 9.1.

Using a grid of five time values from 0 to 4 months, and 16 space values
from 0 to 15 km described in Baker [36] gave a significance level of
p = 0.109, with $\hat d_t$ = 0 years, and $\hat d_x$ = 0 km. The corresponding test
proposed here gave p = 0.0609 over the same grid. The ratio of observed
to expected counts was 12.8.

These tests look promising, and it is hoped that they will be applied by 
practitioners. A FORTRAN95 program for the modified Knox test and 
details of the algorithms used are available from the author. 

Figure 9.1 The distribution of the test statistic T from equation (11)

9.3 EXAMPLES OF METHODOLOGIES: THE ‘WEIRD
BOOTSTRAP’ AND SHIGELLA SONNEI

The reported incidence of sonnei dysentery increased throughout the UK in
the early 1990s, especially in the North-West, and was particularly high in
Salford. A retrospective study was carried out of dysentery transmission
between children attending four Salford schools, to address the question of
where infection was occurring.

A model was formulated in which the hazard of infection was a function of 
risk factors indicative of high contact rates, such as infecter and contact 
living close to each other, attending the same school, etc. Relative risks and 
attributable fractions were estimated by the method of maximum likelihood. 
This was carried out numerically, using a FORTRAN95 program written by 
the author, and which in turn used the NAG [40] function minimiser 
E04UCF. 

The analysis showed that transmission of dysentery from contact with 
infected school toilets was not a major cause of infection in schools 
implementing PHLS guidelines [41], and neither was contact in the 
classroom. The analysis supported the view that closing schools down 
during dysentery outbreaks was not a useful control measure. 

9.3.1 The shigella problem

Shigella sonnei has been responsible for over 90% of isolates of bacillary 
dysentery in the UK in recent years. Young children aged 5-8 years are at 
greatest risk. A typical case presents with diarrhoea lasting 2-3 days after an 
incubation period of 1-3 days. Abdominal cramps, vomiting and fever may 
occur. Susceptibility is general and immunity following infection is short- 
lived. Infection is transmitted by the faecal/oral route from human cases or 
asymptomatic excreters. 

In 1991, there were approximately 9,200 laboratory reports and/or 
notifications of S. sonnei in the UK, representing the highest rates recorded 
for twenty years. There were nearly 17,000 in 1992, and thereafter the 
annual total has fallen steadily from below 7,000 to under 2,000 today. This 
study is of the 1991-1992 epidemic.

The isolation rate of S. sonnei in the North West Regional Health Authority 
rose sharply from 6.2/100,000 in 1990 to 51.3/100,000 in 1991. The 
outbreak commenced in September 1990. The total number of isolates 
reported to North Western Public Health Laboratory (NWPHL) in the period 
1990-92 was 940. The peak annual isolation rate (March 1991 to March 
1992) was 230 per 100,000. Of the 940 cases, 15.1% of isolates were from 
0-2 year olds; 61.4% of isolates were from 3-11 year olds; and 22.7% of 
isolates were from people aged 12 and over. 

Children aged 3-11 years attending 54 out of 101 primary and nursery 
schools in Salford were involved. The highest number of affected children 
attending a particular school was 49 (isolation rate 19%). More than five 
children were affected in nine schools, and the isolation rate exceeded 5% in 
six schools. 

Ascertainment and control of cases was undertaken by Salford 
Environmental Health Department (EHD) in close liaison with schools. 
Because of the protracted and serious nature of the outbreak, head-teachers 
of schools in affected areas were asked to report cases of diarrhoea and/or 
unexpectedly large numbers of absent pupils to the EHD. If two or more 
cases of sonnei dysentery were confirmed within one week, control 
measures were applied. 

School infection control policy consisted of: emphasising the importance of 
handwashing to teachers; inspecting toilets to verify reasonable hygienic 
standards and the presence of adequate warm water, soap and disposable 
towels; exclusion of pupils for 14 days following the onset of symptoms; 
and, thrice daily, thorough cleansing of toilets with disinfectant. 

Figure 9.2 shows the four-weekly incidence of confirmed cases of sonnei
dysentery in 3-11 year old Salford residents from January 1990 to December
1992.

Figure 9.2 Confirmed cases of sonnei dysentery in the Salford epidemic

9.3.2 Data collection

The study period was defined retrospectively from January 1, 1992 to July 
31, 1992. Figure 9.2 shows that this period forms a discrete episode within 
the epidemic curve of the general Salford outbreak. 

The study population was defined using Salford EHD outbreak investigation 
records which gave the school attended by all cases and contacts. It 
consisted of children from the four Salford primary schools with the highest 
numbers of isolates during the study period. 

EHD records contained name, address, age and date of onset of symptoms of 
affected individuals. Sex was imputed from first name, and grid references 
from home addresses. It was thus possible to calculate the distance between 
the homes of any two children. Schools provided details of class 
membership and class size during the study period for the entire study
population. These data were collected in September 1992. 

Two of the schools studied were only a few kilometers apart. To examine the 
effect of infections arising from children not attending the same school, data 
were extracted from the EHD records on all children up to age 11 living in
that area who were infected during the study period. 

All four schools had several toilet blocks. In many instances, a particular 
block was said by school staff to be used exclusively by members of a 
particular group of classes. Class membership was therefore associated with 
a unique toilet block. Staff were closely questioned about the possibility of 
children using toilet blocks other than the one associated with their class 
(especially during playtime when they were more mobile within the school 
premises). Where indiscriminate usage was thought to occur, individuals 
were not assigned to a toilet block; these individuals comprised only 4% of 
the total. 

Notification to the EHD of cases of diarrhoea in schools continued 
throughout outbreaks. It is likely therefore that the majority of affected 
children came to the attention of the EHD. The policy of Salford EHD 
during the study period was to visit and obtain faeces samples from all 
contacts of known cases of S. sonnei dysentery. 

The index case in a household or series of community contacts was taken to 
be the first case notified to the EHD. On microbiological investigation 
however, some of the contacts were found to be infected prior to the index 
case or to be co-primary cases. The EHD outbreak investigation records do 
not state whether cases were incident or revealed through contact tracing. 
The accuracy of onset dates of cases revealed through contact tracing
mainly relies on the memory of parents (‘recall bias’). Most onset dates are 
likely to be accurate to within two days, however. 

9.3.3 Risk factors for dysentery transmission 

The following risk factors were modelled: 

1. Toilet block: This is a plausible source of infection. 

2. Pupils’ age: This may affect transmission of S. sonnei in several ways. 
First, children of 5-6 years are at greatest risk of infection. Also, children 
might tend to be at greater risk of infection from members of their peer 
group, who would naturally tend to be of similar age. 







In the analysis, the device of blocking copes with the first effect, and the 
second effect is modelled explicitly. 

3. Pupils’ sex: Thomas and Tillet [42] show that there are slightly more 
isolates of S. sonnei from male primary school children, a result also 
seen here (male-female ratio 0.55). Most toilet blocks were single sex. 
Males are therefore more likely to use the same toilet block and more 
likely to have an early onset date, if a significant fraction are infected. It 
is also possible that the sexes segregate during play, so that for example 
males are more likely to acquire infection from males than from females. 

The device of blocking prevented the wrong imputation of this sex-based 
effect to infection acquired from school toilets. 

4. Pupils’ class-membership: Person-to-person transmission or
environmental contamination within the classroom could lead to
increased transmission rates.

5. Infection from siblings: This is known to be common. 

6. The infection of pupils from school friends who are not classmates: 
Person-to-person transmission between friends who use the same toilet 
block could occur during break-time or after school hours. Here an 
enhanced transmission between peers could be wrongly ascribed to use 
of the same toilet block. This effect means that estimates of infection 
arising from use of a common toilet block may overstate the amount of 
infection acquired from contamination of the toilet. 

7. Pupils’ attendance at a particular school. 

8. Proximity: The infection of pupils from contact with an infecter living 
nearby, e.g. by playing together outside school hours. 

It can be seen why it is important to model the effect of all these factors 
simultaneously. For example, many UK children are born within two years
of a sibling, and if siblings tend to use the same toilet block, and the sibling 
effect were not modelled, we would erroneously ascribe sibling infection to 
infection from contaminated toilets. Modelling the main effects described 
above reduces error from confounding bias. 

9.3.4 Model assumptions 

The model assumptions are: 







1. From DuPont [43], upon infection with S. sonnei there is a mean latent 
period of 1.4 days before the recipient becomes infectious, followed by 
an infectious period of 2.6 days. Onset dates were known only to the 
nearest day, and so it was considered that infection could have occurred 
at any time up to 4 days prior to onset. It is in theory possible to 
determine these parameters from the data; however with the values 
quoted, the results were not sensitive to changes in the parameter values. 

2. After onset of the disease, children are removed from school until 
recovered. Hence it was assumed that after the onset date, they are not a 
source of infection to children attending school. 

3. On return to the school, children are immune for the duration of the 
epidemic. Hence, after the onset date they cannot be reinfected. This 
assumption is reasonable, as Keusch and Bennish [44] conclude that for 
several months after infection there is immunity to Shigella reinfections 
with the original serotype. 

4. Transmission is homogeneous throughout the whole population, with the 
exception of transmission attributable to risk factors modelled. 
Interaction terms between these effects were also studied. 

5. The relative risk of infection is the same for each block, and similarly 
for each class, etc. 

Tables 9.1 and 9.2 and Figure 9.2 give a general picture of the four schools 
in the study. 

Table 9.1 gives some demographic details of cases in the four schools, and
shows that the mean age and standard deviation of ages of pupils from 
whom S. sonnei was isolated are comparable between schools. Table 9.2 
shows the age ranges and number of classes using toilet blocks. There was 
an average of 30 pupils per class. 

9.3.5 Statistical modeling 

In this section we derive maximum-likelihood estimates and standard errors 
of the relative risks of infection attributable to various risk factors and of the 
corresponding attributable fractions of infections. 

Table 9.1 The numbers and age distribution of the study population

School   Staff   Pupils   Siblings   Mean Age    Standard Deviation
                                     of Pupils   of Age of Pupils
1        3       49       10         6.38        1.93
2        0       31       3          5.77        2.12
3        0       33       4          6.94        2.22
4        0       56       13         5.78        1.81
Totals   3       169      30         6.19        2.05

Table 9.2 Usage of toilet blocks by classes in the four schools studied

School   Toilet Block   Sex      Age Range   Number of Classes
1        A1             Mixed    3-4         1
         A2             Mixed    4-7         4
         A3             Mixed    7-11        5
2        B1             Mixed    3-4         1
         B2             Male     5-11        9
         B3             Female   5-11        9
3        C1             Male     5-8         2
         C2             Female   6-8         2
         C3             Male     3-9         6
         C4             Female   3-11        8
4        D1             Male     3-4         1
         D2             Female   2-4         1
         D3             Male     4-5         6
         D4             Female   5-7         4
         D5             Male     8-10        4
         D6             Female   8-11        4

Risk markers and relative risks

Let the onset of morbidity for the N infected individuals occur at successive
epochs $t_1, \ldots, t_N$. These onset times start at the beginning of the infectious
period. We are interested in the likelihood of observing some subset n of
these (those who attend one of the four schools studied), and in the others
purely because of their role as causative events of infection. Each individual
can have any of q types of spatial proximity to infected individuals. Define
risk markers

$$f_{jkm} = \begin{cases} 1 & \text{if the $k$th person is in type $m$ proximity to the $j$th person} \\ 0 & \text{otherwise} \end{cases}$$

For example, proximity to the house of an infected child is a risk marker,
because children play together, and we can take $f = 1$ if the Euclidean
distance between their houses is less than some distance d. Define also

$$g(t_k, t_j) = \begin{cases} 1 & \text{if the $j$th individual could have infected the $k$th} \\ 0 & \text{otherwise} \end{cases}$$

More precisely, for $g = 1$ we require that $t_k - t_j > \Delta$. Here $\Delta$ was taken as 4
days. When two infections occur simultaneously, it is not known who
infected whom, and so we set $g(t_k, t_j) = g(t_j, t_k) = 1/2$. The problem of tied
onset dates is discussed in detail later.

The hazard of infection of the kth individual at some epoch u is written

$$h_k(u) = \beta\, h'_k(u),$$

where $\beta$ is the (unknown) transmission coefficient for dysentery, and the
reduced hazard $h'$ is modelled using the linear model:

$$h'_k(u) = \sum_{j=1}^{N} g(u, t_j) \left\{ 1 + \sum_{m=1}^{q} \alpha_m f_{jkm} \right\}, \qquad (13)$$

where the terms $\alpha_1 + 1, \ldots, \alpha_q + 1$ are relative risks, unity if the corresponding
risk marker (closeness to an infecter) has no effect.




This model assumes that infected individuals cause infection independently.
The values of $f$ and $g$ will depend on one or more of several critical values
such as the distance d, used to define proximity, and the duration of infection
$\Delta$.
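The additive structure of the reduced hazard in equation (13) can be illustrated with a short sketch. Everything named here is hypothetical: `markers[j][k]` is assumed to hold the q risk-marker values linking infecter j to individual k, and the timing function `g` is supplied by the caller. The window used in the usage note below is purely illustrative and is not taken from the text.

```python
def reduced_hazard(k, u, onsets, markers, alpha, g):
    """Reduced hazard h'_k(u) as a sum of independent contributions.

    Each infecter j contributes g(u, t_j) * (1 + sum_m alpha_m * f_jkm),
    so hazards from separate sources add rather than multiply.
    """
    total = 0.0
    for j, t_j in enumerate(onsets):
        if j == k:
            continue  # an individual is not a source of their own infection
        weight = g(u, t_j)
        if weight:
            excess = sum(a * f for a, f in zip(alpha, markers[j][k]))
            total += weight * (1.0 + excess)
    return total

def window(u, t_j, delta=4.0):
    """Hypothetical timing rule: j can infect during (t_j, t_j + delta]."""
    return 1.0 if 0.0 < u - t_j <= delta else 0.0
```

For instance, with onsets `[0, 1, 3]`, relative-risk excesses `alpha = [2.0, 0.5]`, and the third individual linked to the first by marker 1 and to the second by marker 2, `reduced_hazard(2, 3.0, ...)` with `window` adds a contribution of 3.0 from the first infecter and 1.5 from the second.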



9.3.6 The likelihood function 

Following the method of derivation of likelihood functions outlined earlier,
the likelihood function of observing infections at epochs $t_1, \ldots, t_n$ is thus




where P is the probability that no other individuals in the population are
infected. Once the kth person becomes infected, data are retrospectively
available over the period $\tau_k$ to $t_k$. The data are for the point when symptoms
occur. This will be true for cohort and retrospective cohort or ‘trohoc’
studies, and for case-control studies. Here $\tau_k$ will be some epoch prior to the
start of the epidemic.

Similarly, the ‘weird’ probability that the observed number n of infections 
happen to the particular individuals who were observed to be infected, but at 
any epoch within the period of observation, is 




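The conditional likelihood $L_c = L/L_n$ is

$$L_c = \frac{n!\, \prod_{k=1}^{n} h'_k(t_k)}{\left\{ \sum_{k=1}^{n} \int_{\tau_k}^{t_k} h'_k(u)\, du \right\}^{n}}. \qquad (16)$$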

The nuisance variables $\beta$ and P have disappeared. One can also obtain
equation (16) as a profile likelihood $\sup_\beta L(\beta)$, by estimating $\beta$ from
equation (14) and substituting it back into equation (14). The estimate of $\beta$ is




and $n!$ is replaced by its large-sample approximation $n^n \exp(-n)$. Yet
again, equation (16) can be derived by giving $\beta$ an (improper) prior
distribution with pdf $1/\beta$. On integrating over $\beta$ from zero to infinity, we
regain equation (16), but with $(n-1)!$ replacing $n!$. The constant is of course
irrelevant for the subsequent inference.
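The two alternative routes can be checked with a short calculation. As a sketch, write $I$ for the integrated reduced hazard summed over the n individuals (a symbol introduced here for convenience), and assume that the likelihood depends on $\beta$ only through the factor $\beta^n \exp(-\beta I)$. Then for the profile likelihood

$$\frac{\partial}{\partial\beta}\left\{ n \log\beta - \beta I \right\} = 0 \;\Longrightarrow\; \hat\beta = \frac{n}{I}, \qquad \hat\beta^{\,n} \exp(-\hat\beta I) = \frac{n^n \exp(-n)}{I^n},$$

and for the marginal likelihood with prior pdf $1/\beta$

$$\int_0^\infty \beta^{\,n-1} \exp(-\beta I)\, d\beta = \frac{\Gamma(n)}{I^n} = \frac{(n-1)!}{I^n}.$$

Both expressions are proportional to $I^{-n}$, so under this assumption they differ from one another only by constants that do not involve the relative risks.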

Here L c is the pdf that infections were experienced at the observed epochs 
rather than at some other possible epoch. In the denominator of equation 
(16) although infections as outcomes can occur at any epoch in the range, the 
infections as causative events are fixed at their observed epochs. Cause and 
effect have been decoupled, and we imagine infections occurring at new 
epochs, and their associated infecters still fixed as they were. 

The logic leading to the partial likelihood (Cox regression) method [25] is 
very similar. There, one can condition the likelihood so as to remove the 
unknown function (in our notation) p(t). The conditional likelihood used 
here is simpler, and loses less of the information from the likelihood in 
equation (15). It is appropriate for the short period of observation considered 
here. 

We now develop equation (16). Evaluating the integrals, taking logs, and
discarding the $n!$ factor, we obtain $\ell = \log L_c$ as




n N 

log I I 0 ),j 
'=U=i 



(17) 

where 

=jt l 1 (> 8 ) 

and 

n n / n n 

Pm = ft? /IS CD/,. 

/=!/=! / HH 



Since $g$ is a 0-1 function, $\omega_{ij} = \Delta$, the duration of infection of the $j$th
infecter, unless the $i$th infection occurs before the infection period has
finished, or the interval commences after it has begun to operate. It is zero if
the $j$th infection occurred too late to have caused the $i$th infection. Thus $p_m$,
which can be calculated from the data, is the probability that a random cause
is ‘close’ in attribute-space to a random infection that it preceded, weighting
causes by their periods of operation.

Maximising $\ell$ with respect to $\alpha$ can be carried out numerically, and yields
MLEs of relative risks. Analytically,

$$\frac{\partial \ell}{\partial \alpha_r} = \frac{1}{\alpha_r} \sum_{k=1}^{n} F_{kr} - \frac{n}{\alpha_r} F_r ,$$

where

F = I}=1 gftt'ljkr/v

and

$$F_r = \frac{p_r \alpha_r}{1 + \sum_{m} p_m \alpha_m} . \qquad (19)$$

Setting $\partial \ell / \partial \alpha_r = 0$ gives

$$\frac{1}{n} \sum_{k=1}^{n} F_{kr} = F_r . \qquad (20)$$



The $F_{kr}$ are estimated fractions of the $k$th infection due to the $r$th risk
marker, or estimated attributable fractions. Equation (20) defines the
estimated attributable fraction due to the $r$th risk marker as a sample mean,
and the MLE of relative risk is the solution of the set of equations (19).

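The covariance matrix for $\alpha$ may be estimated as the inverse of

$$-\left. \frac{\partial^2 \ell}{\partial \alpha_r \, \partial \alpha_s} \right|_{\alpha = \hat{\alpha}} ,$$

which is trivially calculable. Since the score

$$\frac{\partial \ell}{\partial \alpha_r} = \frac{n}{\alpha_r} \times \left\{ \sum_{k} F_{kr} / n - F_r \right\} ,$$

and the covariance matrix of the score [25] is

$$-\left. \frac{\partial^2 \ell}{\partial \alpha_r \, \partial \alpha_s} \right|_{\alpha = \hat{\alpha}} ,$$

the MLE of the covariance matrix of the attributable fraction $F_r$ is

$$-\frac{\alpha_r \alpha_s}{n^2} \left. \frac{\partial^2 \ell}{\partial \alpha_r \, \partial \alpha_s} \right|_{\alpha = \hat{\alpha}} = \left\{ \sum_{k=1}^{n} (F_{kr} - F_r)(F_{ks} - F_s) \right\} \Big/ n^2 ,$$

which turns out to be just the usual formula for the sample variance.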

To obtain accurate confidence intervals on relative risks, for large samples 
one can plot the profile likelihood (all other parameters except the ‘a’ of 
interest are varied to maximise the likelihood). The 95% confidence limit is 
at the point where twice the log-likelihood has decreased by $(1.96)^2 = 3.84$.
Confidence limits on the attributable fraction of infections from toilet blocks 
are shown in Figure 9.3. 
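
The profile-likelihood rule above can be sketched on a toy problem. The example below is only an illustration under stated assumptions: it uses a normal sample with unknown mean and variance (not the infection model of this chapter), treats the variance as the nuisance parameter, and locates the two points at which twice the profile log-likelihood has fallen by (1.96)^2 from its maximum. The function names and the data are hypothetical.

```python
import math


def profile_loglik(mu, data):
    """Profile log-likelihood for a normal mean.

    The nuisance variance is set to the value that maximises the likelihood
    for this mu; additive constants are dropped.
    """
    n = len(data)
    sigma2_hat = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * math.log(sigma2_hat)


def profile_interval(data, drop=1.96 ** 2, tol=1e-10):
    """Return (lower, mle, upper): at each limit, twice the profile
    log-likelihood has fallen by `drop` from its maximum."""
    mle = sum(data) / len(data)
    target = profile_loglik(mle, data) - drop / 2.0
    step = max(data) - min(data)  # assumed > 0, i.e. the data are not all equal

    def limit(direction):
        inside, outside = mle, mle + direction * step
        while profile_loglik(outside, data) >= target:  # widen the bracket
            outside += direction * step
        while abs(outside - inside) > tol:               # bisection
            mid = 0.5 * (inside + outside)
            if profile_loglik(mid, data) >= target:
                inside = mid
            else:
                outside = mid
        return 0.5 * (inside + outside)

    return limit(-1), mle, limit(+1)


sample = [4.1, 5.3, 6.0, 4.8, 5.6, 5.1, 4.4, 5.9]  # hypothetical data
lower, estimate, upper = profile_interval(sample)
print(f"MLE = {estimate:.3f}, 95% limits = ({lower:.3f}, {upper:.3f})")
```

For the model in this chapter the same search would be applied to the conditional log-likelihood, re-maximising over all other relative-risk parameters at each trial value of the parameter of interest.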

Figure 9.3 Confidence limits on the attributable fraction of infections from toilet blocks

[Axis label: Attributable percentage]



For small samples, the simplest (and most computer-intensive) method is to
carry out bootstrap resampling to obtain a series of MLEs, $\hat{\alpha}$. They are
sorted, and the confidence interval taken between percentiles of the resulting
sample distribution. The same resamples simultaneously give confidence intervals
on the attributable fractions: these are calculated for each resample, sorted,
and treated as before.
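
A minimal sketch of the percentile bootstrap described above follows. It is an illustration rather than the chapter's procedure: the estimator passed in is a plain sample mean standing in for the conditional-likelihood MLE (which would have to be refitted on every resample), and the function name, data, seed and number of resamples are all assumptions.

```python
import random


def bootstrap_percentile_interval(data, estimator, n_resamples=2000,
                                  level=0.95, seed=0):
    """Percentile bootstrap: resample with replacement, re-estimate,
    sort the estimates, and read off the two tail percentiles."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lower_index = int(tail * n_resamples)
    upper_index = min(n_resamples - 1, int((1.0 - tail) * n_resamples))
    return estimates[lower_index], estimates[upper_index]


def sample_mean(values):
    # Stand-in estimator; in the chapter this would be the MLE of a
    # relative risk or the corresponding attributable fraction.
    return sum(values) / len(values)


observations = [0.8, 1.9, 2.4, 0.3, 1.1, 3.0, 1.6, 0.9, 2.2, 1.4, 0.7, 2.8]
low, high = bootstrap_percentile_interval(observations, sample_mean)
print(f"95% percentile interval: ({low:.3f}, {high:.3f})")
```

Passing a different `estimator` (for example, one returning an attributable fraction) gives the corresponding interval from the same resampling scheme.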

When individuals differ in susceptibility because of age or other
demographic variables, so that $\beta$ varies, similar individuals may be grouped
into blocks, and the conditional likelihood found for each block. The
total conditional likelihood is $L_c = \prod_i L_c^{(i)}$. In this study, toilet blocks were
taken as the unit of blocking, as one block is only used by children of similar
age, and usually of the same sex.

Having estimated $\alpha$ by maximising the total conditional likelihood, the
estimation of attributable fractions is also straightforward. The score is the 
sum over block scores, etc. Equation (20) stands, but equation (19) becomes 




where all block-specific quantities have been given an upper suffix in 
parentheses. Thus the $p^{(i)}$ are now calculated using only infections in block $i$.

9.3.7 Results 

The first question considered was the extent to which infection is acquired 
from schoolmates rather than simply from other children living in the same 
area. The model was fitted to onset dates for children attending Schools 1 
and 4, which are just over 1 km apart. Infections could be from children 
attending the same school, the other school, or from 70 additional cases 
among children living in the area. Transmission of infection could be 
enhanced if the infecter lived within 3 km of the contact, attended the same 
school, or was a sibling of the contact. 

Table 9.3 shows the results. The model parameter is ‘relative risk minus 
unity’; thus, it is zero if that risk factor has no effect. To illustrate the 
meaning of the table: the column labelled ‘z’ shows the relevant model
parameter divided by its estimated standard error; parameters with z > 1.65 will
correspond to risk factors that significantly enhance disease transmission
(one-sided test, 5% significance level). The second to the last column shows 
the estimated percentage of cases attributed to the corresponding risk factor. 

Thus, infecters living in the same area as a contact were 185.8 times more likely
to infect that contact than infecters who did not, and this effect is significant
(z = 3.91 > 1.65). 45.7% of infections were attributed to this risk factor. About 36% of
infections arise because the infecter attended the same school as the contact.
Hence more than half the infections do not occur at school.

Table 9.3 Results*

Risk Marker    RR-1     SE**     z       Attributed Fraction    SE**
Sibling        8945     1943     4.60    17.8%                  4.6%
Same area      184.8    47.2     3.91    45.7%                  4.4%
Same school    450.3    155.8    2.89    36.3%                  3.2%

* RR denotes relative risk. The ‘z’ value is RR-1 divided by its estimated standard
error. This table is based on 105 infections in Schools 1 and 4 from school-attenders
and potential neighborhood contacts.

** SE denotes standard error.
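
As a check on how the ‘z’ column is read, the short sketch below recomputes z as (RR-1) divided by its standard error from the rounded entries of Table 9.3 and compares it with the one-sided 5% critical value of 1.65 quoted in the text. The variable and function names are illustrative, and because the inputs are rounded the recomputed values can differ from the printed ones in the last digit.

```python
ONE_SIDED_5_PERCENT = 1.65  # critical value quoted in the text

# (risk marker, RR - 1, standard error, z as printed in Table 9.3)
table_9_3 = [
    ("Sibling", 8945.0, 1943.0, 4.60),
    ("Same area", 184.8, 47.2, 3.91),
    ("Same school", 450.3, 155.8, 2.89),
]


def z_score(estimate, standard_error):
    """Parameter estimate divided by its estimated standard error."""
    return estimate / standard_error


results = []
for marker, rr_minus_1, se, z_printed in table_9_3:
    z = z_score(rr_minus_1, se)
    significant = z > ONE_SIDED_5_PERCENT
    results.append((marker, z, z_printed, significant))
    print(f"{marker:12s} z = {z:.2f} (printed {z_printed:.2f}), "
          f"significant: {significant}")
```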






Table 9.4 Results*

Risk Marker     RR-1    SE**    z       Attributed Fraction    SE**
Toilet block    1.65    1.06    1.56    13.0%                  1.3%
Class           1.45    1.78    0.81    3.3%                   0.7%
Sibling         33.8    7.36    4.60    13.2%                  3.2%
Similar age     1.67    1.02    1.64    15.5%                  1.5%
Live nearby     1.92    0.56    3.46    31.2%                  2.5%

* This table is based on 170 infections among school-attenders caused by
schoolmates.

** SE denotes standard error.

The next step is to examine infections acquired purely from schoolmates in 
all four schools. Table 9.4 shows the results of fitting the model to onset 
dates classified by school, toilet block, class, and neighborhood. 

Extra infections attributed to sharing a toilet block with an infected 
individual are only 13% of the total infections acquired from schoolmates, 
and presumably, as mentioned earlier, some of this effect may not be due to 
direct contact with infected toilets. Figure 9.3 shows the upper 95% 
confidence limit on the attributable fraction obtained from the profile 
likelihood using the factors in Table 9.4. The lower 95% limit would lie 
below zero. This effect is not in fact statistically significant, and so may 
even be entirely absent. Hence the level of hygiene currently prevailing 
during outbreaks is certainly adequate, and little would be gained by extra 
spending on disinfection.

The effect due to sharing a classroom is also small at 4% and not quite
statistically significant.

The sibling effect is significant at more than four standard deviations, and 
siblings are very many times more likely to infect each other than non- 
siblings. However, less than 14% of infections arise in this way, as each 
contact has on average few siblings, but many schoolmates. 

Again, the parameter measuring increased transmission between children 
within a year of the same age also hovers near statistical significance. Such 
an effect if present would account for 15.5% of infections. 

Enhanced transmission between children who live close to each other 
certainly occurs (z > 1.65). This suggests that infection does not occur only 
on school premises. 

Finally, Table 9.5 shows the results of including those factors identified as 
important, returning to data from only the two schools in the same area first 
considered. Again, infections acquired from school toilets and from 
classmates comprise only 4% of the total. There may be increased contact 
with other children of the same age and attending the same school, and one 
fifth of all infections arise from schoolmates living nearby (within 0.75 km). 
This increased risk did not occur with children from other schools living 
nearby. 

It therefore seems possible that relatively few infections actually happen on 
school premises, but that attending a common school facilitates contact 
between children outside school hours, e.g. near their homes. From Table 9.5 
one could attribute an upper bound of 3.8 + 2.3 = 6.1% of infections to 
contacts occurring on school premises, excluding contacts with children of 
the same age, or 3.8 + 2.3 + 21.1 = 27.2% including the latter. Not all of
these necessarily occur on school premises, but at most about a quarter of
infections can occur there. Hollins [45] reached a similar conclusion, that
much infection is spread outside the school environment between 
neighboring families. 

9.3.8 Conclusion on the role of the schools in dysentery transmission 

The role of schools in spreading infection is of great interest to the public. 
Out of the total of 36% of infections that could be attributed to schools, 
perhaps only 6-27% of infections may arise on school premises. Since the 
four schools with the greatest incidence of S. sonnei were studied, the true 
attributable fraction of infections may well be even less than this figure.

Table 9.5 Analysis using only those factors identified as important*

Risk Markers                   RR-1    SE**    z       Attributed Fraction    SE**
Toilet block                   5.0     16.3    0.31    3.8%                   0.46%
Class                          9.7     21.6    0.45    2.3%                   0.58%
Sibling                        301     68.0    4.43    17.2%                  4.5%
Across sexes                   4.2     3.1     1.37    15.3%                  1.7%
Same area                      1.85    1.23    1.51    13.2%                  1.8%
Similar age and school         22.3    14.5    1.54    21.1%                  2.3%
Live nearby and same school    12.2    7.4     1.65    19.8%                  2.0%

* The table is based on 105 infections in Schools 1 and 4 from school-attenders and
potential neighborhood contacts.

** SE denotes standard error.



The majority of infections are spread between children living in the same 
area, between siblings, or between schoolmates living close to each other. 
This last finding suggests that school attendance mediates children’s social 
contacts; children do not often acquire infection from children attending 
other schools, even if they live nearby. 

Closing schools during outbreaks of dysentery causes considerable economic 
loss. It would at best reduce the rate of infection by a quarter, and might 
even increase it, if children spend the freed time playing with neighboring 
children. 

School toilets are often thought by the public to be the root of all evil where 
dysentery infection is concerned. In this study, no statistically significant 
amount of infection could be attributed to this source. This finding thus does 
not support the conclusion of Hutchinson [46] that toilets represent a major
risk factor. 



There would seem little point in attempting to reduce infections occurring 
within the classroom. It seems that sharing a classroom with an infected
child is not a risk factor. This is perhaps not surprising, as there will be little
body contact in the classroom; it suggests that environmental contamination 
via fomites, etc. is negligible. However, one fifth of infections arise because 
of contact with those living nearby (within 0.75 km). It might be worthwhile 
encouraging parents to keep their children away from possible infecters 
during outbreaks, or to discourage their infected children from playing with 
others. 

9.4 AVENUES FOR FURTHER RESEARCH 

The practical aim of epidemiological research is to identify risks to public 
health, and to present reasoned evidence such that preventative action will be 
taken. 

We are moving into a time when great amounts of data and great computing 
power will be available to the researcher. There is a need for the science of 
statistics itself to develop in the area of model choice when very many 
models are considered by the ‘data miner’. It is also beginning to be 
necessary to evaluate the many different models and tests available in order 
to produce ‘good practice’ guidelines. 

The research presented in this chapter is based firmly on statistical 
principles. The rallying cry of ‘back to orthodoxy’ will never be popular, but 
we must not lose sight of basic principles in the welter of new possibilities 
opened up by increased computing power. 

Imaginative methods of modelling and of carrying out statistical inference, 
such as Cox’s partial likelihood, are required, in order to continue to make 
valid inferences about risk in the presence of confounding variables. 
Sophisticated numerical methods and algorithms are also needed to enable 
the rapid computation needed for epidemiological studies in the 21st century.

SUMMARY

Measuring health outcomes is critical for individual and societal decision 
making. This chapter briefly reviews the field of health outcomes modeling 
in general and provides detailed theoretical background for one specific class 
of such models, the Quality-Adjusted Life Years model, which is primarily
grounded in operations research and utility theory. The chapter describes 
methodological issues and concludes with a discussion of promising areas 
for further research.

10.1 INTRODUCTION

The measurement of health outcomes is a critical matter in medical decision 
making. When clinicians and patients make clinical decisions such as 
choosing among alternative medical treatments, they base at least part of 
their judgment on their perceptions of relative gains or losses in future 
health. The existence of a good metric, or quantitative system, for measuring 
future health resulting from alternative treatments would greatly facilitate 
the process of making such decisions. The ultimate goal of medical 
treatment is not to improve a particular clinical parameter, to eliminate 
particular symptoms, or to cut costs, but to improve the health of patients. There
is little dispute that improving health, in medicine, involves two main 
components: increasing life expectancy or “length of life” and increasing 
“quality of life” of patients [1]. Clinical outcomes defined in terms of 
mortality or physiological measures such as blood pressure or intermediary 
diagnostic test results, are often necessary, but insufficient, for making a 
final treatment decision. Patients’ preferences for health outcomes need to be 
captured and explicitly included when contrasting and evaluating alternative 
treatments for making medical decisions. Any health outcome measure 
would need to account, in some way, for both length and quality of life. 

Similarly, at the population level, capturing and aggregating those 
preferences is also often deemed necessary for evaluating new treatments, 
health services or medical technology. Failure to include such information 
may result in suboptimal decisions that do not conform to individual or 
societal preferences. For example, in cost-effectiveness analysis, a standard 
tool used in health economics, the costs and benefits of one health 
intervention are compared with costs and benefits of another by calculating 
the incremental cost-effectiveness ratio, which expresses the cost per 
additional unit of health benefit conferred for one intervention compared to 
another [2]. In such a model, the complete elicitation and estimation of 
relevant costs and the most representative and accurate measure of health 
benefits, or effectiveness, are needed. If a goal is to permit comparisons 
across diseases or conditions, health benefits can be expressed in generic 
terms such as “health-adjusted life years” (HALYs), as opposed to disease-
or condition-specific terms (such as number of specific cases averted).
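
The incremental cost-effectiveness ratio described above is a simple quotient, sketched below. The function name and all cost and effect figures are hypothetical and chosen only to make the arithmetic visible; effects are taken to be in generic health units such as QALYs.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per additional
    unit of health benefit of the new intervention over the old one."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("equal effectiveness: the ratio is undefined")
    return (cost_new - cost_old) / delta_effect


# Hypothetical example: the new intervention costs 20,000 more and
# yields one additional health-adjusted life year per patient.
ratio = icer(cost_new=50_000.0, effect_new=6.0,
             cost_old=30_000.0, effect_old=5.0)
print(ratio)  # 20000.0 per additional unit of health benefit
```

Because the denominator is expressed in generic units, ratios computed this way for interventions in different disease areas can be compared directly.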

HALYs can be viewed as a large field encompassing a number of 
measurement systems, which differ in at least three overall dimensions: (a) 
disease-specific versus generic measures; (b) non-preference versus 
preference-based measures; and (c) use for individual versus societal 
decision making. As mentioned before, a generic measure permits 
comparison of health benefits across diseases or conditions and is not 
naturally tied to a certain disease or condition (as would be the case with
physical measures such as blood pressure or total cholesterol level or a
condition-specific rating scale such as a scale measuring back pain). As 
noted by Fryback [1], another fundamental difference between measurement 
systems is whether the numbers generated reflect individual preferences for 
different health states and thus are derived from human judgment about the 
relative desirability of being in one health state versus another, or are 
derived in a manner not directly related to preferences. For example, the 
eight scales of the short-form health survey SF-36™ [3] produce numbers 
that do not reflect preferences. Utility-based models such as the Health 
Utility Index [4], on the other hand, are specifically designed to reflect 
preferences. Finally, it is important to note that measures designed to support 
individual decision making may or may not lend themselves to aggregation 
across individuals in a population to assist in societal decisions. Thus, in 
terms of the applicability and validity of measurement systems, it is 
important to consider the viewpoint being adopted. Nord et al. [5], for 
example, have identified a number of limitations in aggregating individual 
measurements of health-related quality of life for assessing the societal value 
of health care investments and have proposed adjustments for dealing with 
such problems. 

A number of measurement systems have been developed by researchers 
from many different disciplines. In this chapter, we primarily review 
selected contributions of operations researchers, economists and 
psychologists who developed one of the most widely used, and criticized,
classes of HALYs - the quality-adjusted life years (QALYs). The QALY
model is a generic, preference-based measurement system designed to assist
in individual decision making. It is also widely used for societal decision making,
provided that its limitations are properly dealt with [5, 6]. In this chapter, we
review some of the literature, present major methodological issues, and 
identify promising areas of research. 

10.2 QALY MODEL - THEORETICAL CONSIDERATIONS 

10.2.1 Background 

The concept and techniques of utility theory have been applied for health 
outcome measurement in order to incorporate patients’ preferences and risk 
attitudes. Such utility measurement techniques have been developed and 
applied, to a large degree, within the context of “chronic health states”. A 
chronic health state is generally defined as a health state that stays constant 
over a relatively long period of time (typically more than one year). Most 
real-life situations, however, challenge the assumption of a constant health 
state. Chronic diseases, even when treated, are generally not stable but lead 
to health status deterioration over time. Health states generally do not remain
at the same level over lengthy periods of time even in healthy individuals,
for whom decrements are expected with the normal aging process. 

The most widely applied model for health outcome measurement in medical 
decision analysis is the quality-adjusted life year (QALY) approach. The 
QALY model has emerged as the gold standard for health outcome 
measurement [7]. Both life expectancy and quality of life are taken into
account in a QALY measure. The number of QALYs is typically obtained
by multiplying life expectancy by a numerical weight associated with a
constant health state experienced during the remaining life expectancy. The
weight is a number between 0 and 1 where 0 is defined as “death” and 1 as
“perfect health”. On this scale, the weight associated with a health state
represents the health-related quality of life (HRQOL) of that health state.
The product of the HRQOL weight and the life expectancy is a measure of
the desirability of the health state experienced during the life expectancy.
For example, as shown in Figure 10.1, the health of an individual who has a
life expectancy of 20 years with a disease that has an HRQOL weight of 0.7 is
valued at 20 × 0.7 = 14 QALYs.



Figure 10.1 Illustration of QALYs in the case of constant health state

Extending the approach to sequences of chronic health states such as the
sequence shown in Figure 10.2, one typically calculates the desirability of
such a sequence by taking the sum of all products of duration and health
weight corresponding to the health states in that sequence. For example, an
individual with the health profile shown in Figure 10.2 would value that
sequence at (1 × 8) + (0.7 × 5) + (0.4 × 7) = 14.3 QALYs.
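
The two calculations above (a constant health state and a sequence of states) reduce to the same weighted sum, as the following sketch shows. The function name and the representation of a health profile as a list of (weight, duration) pairs are choices made for this illustration only.

```python
def qalys(profile):
    """Sum of HRQOL weight x duration over the states in a health profile.

    `profile` is a list of (weight, years) pairs, with each weight on the
    0 (death) to 1 (perfect health) scale.
    """
    total = 0.0
    for weight, years in profile:
        if not 0.0 <= weight <= 1.0:
            raise ValueError("HRQOL weights must lie between 0 and 1")
        if years < 0:
            raise ValueError("durations must be non-negative")
        total += weight * years
    return total


constant_state = [(0.7, 20)]                  # the Figure 10.1 example
multistate = [(1.0, 8), (0.7, 5), (0.4, 7)]   # the Figure 10.2 example

print(round(qalys(constant_state), 1))  # 14.0
print(round(qalys(multistate), 1))      # 14.3
```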

10.2.2 Theoretical foundation - Risk neutral QALY model 

Figure 10.2 Illustration of QALYs in the case of non-constant health profile

The quality-adjusted life year (QALY) model is a measurement technique
for health outcomes that takes into account both quality and quantity of life.
It is the product of life expectancy and a utility-based measure of quality of
life of the remaining years of life. The QALY method was developed in the
1970s [8]. The original theoretical properties of the QALY measure are
summarized in a paper by Pliskin et al. [9]. They show that QALY is a valid 
utility function, which represents individual preferences, if three conditions 
hold. These conditions are as follows. 

1. Mutual utility independence (MUI) of life years (T) and health state (Q).
This assumption means that preferences for gambles over either one of the
two attributes, with the other attribute held at a fixed level, do not depend on
the particular level of that other attribute. For example, an arthritis patient
does not judge his own health state differently because he has five or 20
years remaining in his life. If MUI holds, one can construct a multiattribute
utility model for the health profile (Q,T) as follows:

U(Q,T) = a·U(Q) + b·U(T) + (1-a-b)·U(Q)·U(T)

where U(Q,T) is the utility of health profile (Q,T); U(Q) is the utility of
health state Q; U(T) is the utility of life years T; a and b are scaling constants.

2. Constant-proportional tradeoff property. This requires that the proportion
of the remaining life that one would trade off for a specified quality
improvement is independent of the actual amount of the remaining life. For
instance, consider the situation where one asks an individual to trade off an
amount of time of his/her remaining years of life in order to have perfect
health versus the poorer health state. If he/she gives up 10 years out of 20
remaining life years, he/she would equally give up 5 out of 10, 2.5 out of 5,
and so on. Thus, the proportional trade-off is constant, in this case always
exactly half of his/her remaining life years.

3. Risk neutrality over life years. This assumption means that the utility
function for life years is linear. If risk-neutrality over life years holds in all
health states, MUI and constant proportional trade-off will also hold [10].
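
The conditions above can be made concrete with a small sketch. It assumes the risk-neutral form in which the utility of T years in a state with quality weight h is h × T, and shows that the fraction of remaining life a person would give up for perfect health is then the same whatever T is. The function names and the weight of 0.5 are illustrative assumptions; the first function simply evaluates the multiattribute form given under condition 1.

```python
def multiattribute_utility(u_q, u_t, a, b):
    """U(Q,T) = a*U(Q) + b*U(T) + (1 - a - b)*U(Q)*U(T)."""
    return a * u_q + b * u_t + (1.0 - a - b) * u_q * u_t


def fraction_of_life_traded(weight, years):
    """Fraction of remaining life given up for perfect health, assuming
    the risk-neutral QALY form: weight * years = 1.0 * years_in_full_health."""
    years_in_full_health = weight * years
    return (years - years_in_full_health) / years


# With a quality weight of 0.5, half the remaining life is traded
# regardless of how much life remains (10 of 20, 5 of 10, 2.5 of 5).
for remaining in (20.0, 10.0, 5.0):
    print(remaining, fraction_of_life_traded(0.5, remaining))
```

Setting a = b = 0 in `multiattribute_utility` leaves only the product term U(Q)·U(T), which is the shape of the QALY formula itself.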

10.2.3 Theoretical foundation - Risk adjusted QALY model 

The above three assumptions are the requirements for the standard QALY 
model which assumes risk neutrality with respect to life duration and hence 
assures linearity of the component utility function over life years. However, 
the assumption of linearity is not empirically realistic. For example, McNeil, 
Weichselbaum, and Pauker [11] found that patients with bronchogenic 
carcinoma had moderate risk aversion over life years. Stiggelbout et al. [12] 
found mild risk aversion in male patients with testicular cancer. 
Additionally, Verhoef et al. [13] conducted a study with healthy women and
found risk aversion over life years, but risk-seeking preferences over 
gambles involving short durations. On the contrary, in a different health 
context, Mehrez and Gafni [14] found risk aversion when the length of the 
durations increased. Thus, the violation of risk neutrality in the standard 
QALY model would lead to invalidity of QALY as a representation of an 
individual’s preferences. 

However, QALYs can be defined in either a risk-neutral (standard QALY 
model) or a more general risk-adjusted form (generalized QALY model), 
depending on whether the decision maker is risk neutral or not with respect 
to uncertainty regarding life years. If the decision maker is risk neutral with 
respect to life years, then QALYs will be decomposed in the following form: 

Risk-neutral QALYs = U(Q,T) = H(Q) × T

The more general risk-adjusted QALYs are defined as follows:

Risk-adjusted QALYs = U(Q,T) = H(Q) × T^r

In both formulations, H(Q) is the quality weight, measured on a scale
between 0 (death) and 1 (full health), and r is the risk parameter that defines
the shape of the utility function for quantity of life. If the subject is risk
neutral, r = 1. If mutual utility independence and constant proportional
trade-off hold, then risk-adjusted QALYs, as defined by Pliskin et al. [9], are a
valid utility function representing preferences over constant health states 
[10]. 
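
To see what the risk parameter r does, the sketch below evaluates the risk-adjusted form H(Q) × T^r and computes the certain number of years that is as attractive as a gamble over life years in a fixed health state. The function names, the 50/50 gamble and the value r = 0.5 are assumptions chosen for illustration; for a fixed state with H(Q) > 0 the quality weight cancels, so it does not appear in the certainty equivalent.

```python
def risk_adjusted_qalys(quality_weight, years, r=1.0):
    """H(Q) * T**r; r = 1 gives the standard risk-neutral QALY."""
    return quality_weight * years ** r


def certainty_equivalent_years(gamble, r):
    """Certain duration with the same expected utility as `gamble`,
    a list of (probability, years) pairs in one fixed health state."""
    expected = sum(p * years ** r for p, years in gamble)
    return expected ** (1.0 / r)


gamble = [(0.5, 0.0), (0.5, 20.0)]  # immediate death or 20 more years

print(certainty_equivalent_years(gamble, r=1.0))  # risk neutral: the mean, 10 years
print(certainty_equivalent_years(gamble, r=0.5))  # r < 1: about 5 years
```

With r below 1 the certainty equivalent falls below the expected 10 years, which is the risk-averse pattern over life years reported in the empirical studies cited above.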
10.2.4 Theoretical foundation - The zero-condition

Instead of the three assumptions established by Pliskin et al. [9], which 
require knowledge of concepts from utility theory, Bleichrodt et al. [15] 
suggested a more elementary and fundamental characterization of QALYs 
that can relax Pliskin et al.’s [9] assumptions. They found that risk neutrality 
together with the “zero-condition” are sufficient to imply the existence and 
validity of the QALY model. The “zero-condition” states that, for a life
duration of zero years, all health states are equivalent, whatever their
quality of life. The zero-condition seems unavoidable in the medical
context. Thus, the only assumption that is needed to imply the existence and 
validity of the QALY model is the risk neutrality for all health states. 

However, there is ample empirical evidence showing a violation of risk 
neutrality as previously described. A generalized QALY model that can 
relax the assumption of risk neutrality has been established to solve the risk 
neutrality issue. A generalized QALY model has the following form: 

U(Q,T) = V(Q) W(T),

where U(Q,T) is the utility of the health profile (Q,T), V(Q) is the value or
utility function over health states Q, and W(T) is the function that values life
duration and can be nonlinear, with W(0) = 0. Instead of the risk neutrality
condition, Miyamoto et al. [16] suggested another condition, “standard
gamble invariance” (SG invariance). SG invariance basically says that, if Q
and Q' are unequal to death and p is the probability equivalent of (Q,T) with
respect to (Q,Y) and (Q,Z), then p is also the probability equivalent of (Q',T)
with respect to (Q',Y) and (Q',Z) [16]. Without risk neutrality, a generalized
QALY model holds if and only if both the zero-condition and SG invariance
hold.
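
SG invariance is easy to see numerically under the generalized form U(Q,T) = V(Q) W(T): the probability equivalent solves V(Q)W(T) = p·V(Q)W(Y) + (1-p)·V(Q)W(Z), so V(Q) cancels whenever it is non-zero. The sketch below checks this for two health states; the square-root duration function, the state values 0.9 and 0.4 and the durations are hypothetical choices, not taken from the studies cited.

```python
def probability_equivalent(v_q, w, t, y, z):
    """Probability p at which (Q,T) for certain is as good as a gamble giving
    (Q,Y) with probability p and (Q,Z) otherwise, under U = V(Q) * W(T)."""
    def utility(years):
        return v_q * w(years)
    return (utility(t) - utility(z)) / (utility(y) - utility(z))


def w_sqrt(years):
    """Hypothetical nonlinear duration function with W(0) = 0."""
    return years ** 0.5


p_mild = probability_equivalent(0.9, w_sqrt, t=9.0, y=16.0, z=4.0)
p_severe = probability_equivalent(0.4, w_sqrt, t=9.0, y=16.0, z=4.0)
print(p_mild, p_severe)  # the same value for both health states
```

A health state valued at zero (equivalent to death) would make the denominator vanish, which is why the condition is stated only for states unequal to death.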

10.3 METHODOLOGICAL ISSUES 

The QALY model is subject to a number of methodological issues. Those 
include both theoretical and practical issues. Practical issues, especially in 
terms of development and use of alternative utility assessment methods 
suitable for eliciting and constructing the utility function over health states 
Q, have been discussed elsewhere (see, for example, [17]). Current popular 
methods include the visual analog scale (VAS), time-tradeoff (TTO), and 
standard gamble (SG) techniques. In the following sections, we primarily 
focus on theoretical issues.

10.3.1 Validity of MUI and constant proportional tradeoff

Besides challenging the risk neutrality assumption, several studies on the 
validity of the other assumptions have been performed. For example,
Miyamoto and Eraker [18] tested the mutual utility independence 
assumption and found empirical support for this assumption. Bleichrodt and 
Johannesson [19] performed empirical tests on both the utility independence 
and constant proportional tradeoff assumptions. They found that without 
adjustment for imprecision of preference (imprecision adjustment was 
suggested because of the unfamiliarity of the subjects regarding both the 
health states and the elicitation methods), 22.8% of the subjects satisfied the 
constant proportional tradeoff assumption, 13.4% satisfied utility 
independence, and 5.8% satisfied both assumptions. However, with the 
imprecision adjustment, 90.1%, 75.8% and 88.8% of the subjects satisfied 
constant proportional tradeoff only, utility independence only, and both 
assumptions, respectively. The authors concluded that the constant 
proportional tradeoff holds roughly and utility independence holds, but in a 
much weaker way. Pliskin et al. [9] reported 25 pairs of time-tradeoff 
responses from 10 subjects in hypothetical questions concerning the relief of 
different levels of anginal pain. They found that only four out of 25 pairs 
were consistent with the constant proportional tradeoff assumption. 

10.3.2 Validity of utility theory 

Expected utility theory or the von Neumann-Morgenstern expected utility
theory is the foundation for most health outcome assessment and
measurement techniques. A utility function exists when certain axioms hold.
Three axioms of expected utility theory [20], known as normatively
compelling rules for rational decisions under uncertainty, are as follows.
Here, $X$ is a set of outcomes; $\Delta(X)$ is the set of probability distributions over
$X$; $\succ$ denotes an individual's preference relation over probability
distributions; and $\sim$ denotes the indifference relation over probability
distributions.

1. Weak order

$\succ$ is asymmetric ($p \succ q \Rightarrow$ not $[q \succ p]$) and both $\succ$ and $\sim$ are transitive
(if $p \succ$ (or $\sim$) $q$ and $q \succ$ (or $\sim$) $r$ then $p \succ$ (or $\sim$) $r$ for all $p, q, r \in \Delta(X)$).

2. Independence

For all $p, q, r \in \Delta(X)$ and any $\alpha \in (0, 1]$, $p \succ q$ if and only if
$\alpha p + (1-\alpha) r \succ \alpha q + (1-\alpha) r$.

3. Continuity Axiom

For all $p, q, r \in \Delta(X)$ such that $p \succ q \succ r$, there exist $\alpha$ and $\beta \in (0, 1)$
such that $\alpha p + (1-\alpha) r \succ q \succ \beta p + (1-\beta) r$.

Thus, in addition to the three required assumptions of QALYs described 
previously, how adequately QALYs represent preferences over health states 
also depends on whether QALYs are consistent with von Neumann and
Morgenstern’s expected utility theory. If the axioms of von Neumann and
Morgenstern’s expected utility theory hold true, decision makers should be
able to make decisions that are consistent with their underlying preferences.
However, in medical decision making, as in many other application
domains, violations of all three axioms have been shown and are well known
[21]. These include Allais’ paradox and Ellsberg’s phenomenon and are not
reviewed here. Instead, we focus our attention on a more important problem,
developing a proper decomposition for multistate health profiles as shown in 
Figure 10.2. 

10.3.3 QALYs for multistate health profiles 

In the case of multistate health profiles, QALYs are generally calculated as 
the sum of all products of duration and health preference weight for all 
health states representing the health profile. Bleichrodt [22] has shown that 
for such decomposition to hold, the assumption of additive independence 
must hold. In essence, additive independence requires that the preference for 
one health state be independent of preference for other health states in the 
multistate health profile. 

For example, consider the two multistate health profiles depicted in Figure 
10.3. Both Health Profile 1 and Health Profile 2 have been designed to 
produce the same amount of QALYs (the area under each curve). The two 
profiles are clearly different, yet the QALY model would rate them as 
equally preferred. Some individuals, however, may have a preference for 
one pattern over another. Many potential factors that define the pattern of 
health profiles might affect an individual’s preferences. The QALY 
framework currently fails to account for these factors. 
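
The additive calculation can be made concrete with a short sketch. The two 
profiles and their preference weights below are hypothetical (they are not 
the profiles of Figure 10.3); they are chosen only so that both yield the 
same QALY total while following opposite patterns. 

```python
# Additive (duration-weighted) QALY model for a multistate health profile.
# A profile is a list of (duration_in_years, preference_weight) pairs.
# All numbers are hypothetical and serve only as an illustration.

def qalys(profile):
    """Sum of duration x preference weight over all health states."""
    return sum(duration * weight for duration, weight in profile)

# Same three health states, experienced in opposite order.
improving = [(5, 0.4), (5, 0.6), (5, 0.8)]  # health gets better over time
declining = [(5, 0.8), (5, 0.6), (5, 0.4)]  # health gets worse over time

print("improving:", qalys(improving))
print("declining:", qalys(declining))
```

Because the additive model ignores the order of the states, it assigns both 
profiles the same value, even though a respondent might well prefer one 
pattern to the other. 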

10.3.4 Violation of additive independence 

Several empirical studies have explored the validity of the additive 
independence assumption. Richardson et al. [23] examined the validity of 
the additive QALY model using a 16-year post-mastectomy health profile 
characterized by gradual deterioration across three health states: moderate 
side effects during the first five years, mild side effects for the next 10 
years, and severe side effects during the final year, when the breast cancer 
recurs. Sixty-three female respondents participated in the study. Rating 
scale, time-tradeoff, and standard gamble techniques were used to assess 
the utility of each health state and the holistic utilities for the health 
profiles. Preference scores for the constituent states were combined to 
estimate scores for the health profile using discount rates of 3% and 9%. 
They found that holistic preferences for the multistate health profile 
(whether assessed with a rating scale, time-tradeoff, or standard gamble) 
were significantly different from composite preferences derived from the 
constituent health states, irrespective of the discount rate applied. 
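
A composite score of this kind can be sketched as a discounted sum over the 
years of the profile. The durations (5, 10, and 1 years) follow the profile 
described above, but the preference weights are hypothetical, and the 
year-by-year discounting convention is an assumption; the study's exact 
procedure is not given here. 

```python
# Composite (constituent-state) score with annual discounting.
# Weights are hypothetical; each year t (starting at 0) is discounted
# by 1 / (1 + rate) ** t, which is an assumed convention.

def discounted_qalys(profile, rate):
    """profile: list of (whole_years, preference_weight) in time order."""
    total, year = 0.0, 0
    for duration, weight in profile:
        for _ in range(duration):
            total += weight / (1 + rate) ** year
            year += 1
    return total

# moderate side effects, then mild, then severe (hypothetical weights)
profile = [(5, 0.7), (10, 0.8), (1, 0.4)]

for rate in (0.0, 0.03, 0.09):
    print(rate, round(discounted_qalys(profile, rate), 2))
```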



Figure 10.3 Two multistate health profiles with equal amount of QALYs 

Kupperman et al. [24] also investigated whether preferences for multiphase 
health states can be approximated by preferences from constituent health 
states. One hundred and twenty-one female subjects were asked to assess 
their preferences for eight health profiles, each composed of three to four 
health states, in the context of prenatal diagnosis choices (chorionic villus 
sampling and amniocentesis), by using visual analog scaling and standard 
gamble techniques. The authors explored whether a different statistical 
formulation could be derived to predict preference scores for health profiles 
from their constituent health states preference scores. They found that a 
duration-weighted additive model, as used in the conventional QALY 
model, was not predictive. A multiple regression model with statistically 
inferred weights predicted the preferences for the profiles better than the 
duration-weighted model. 

MacKeigan et al. [25] used the time-tradeoff technique to compare 
preference scores for the same lifetime paths between holistic and composite 
assessment. One hundred and one participants with type 2 diabetes assessed 
their preferences regarding four hyperglycemic treatment profiles lasting 30 
years, composed of eight discrete treatment states. The authors found no 
significant differences between holistic and composite scores, a result that 
conflicts with those of Richardson et al. [23] and Kupperman et al. [24]. 
However, the health profiles used by MacKeigan et al. [25] consisted of 
progressive minor deteriorations in state, whereas the health profiles in 
the other studies consisted of critically different health states. The 
authors noted that the similarity of the profiles in their study may itself 
explain why composite and holistic scores did not differ. They recommended 
that future research be repeated with profiles that are more distinct and 
with sequencing effects that are more pronounced. 

In Spencer's study [26], three health states defined with the EuroQol 
classification system [27] were used in each multistate health profile. Each 
health profile in the study had a 10-year duration and contained three 
different health states with durations of three, three, and four years 
respectively. Two tests were conducted: a test of additive independence and 
a test of the overall additive model. Twenty-nine subjects participated in the 
study. A violation of additive independence was found in the first test. 
However, in the test of the additive model, only one of the two versions 
resulted in a rejection of the additive model. Thus, Spencer could not 
conclusively reject the additive model. The author suggested that a larger 
sample size might allow the test to detect significant differences in the 
results. Utilities based on holistic elicitation were also compared with 
utilities derived from the constituent states. Two of the seven profiles 
exhibited a significant difference between holistic and constituent-state 
elicitation, which implied that the additive independence assumption was 
violated. 

10.4 FUTURE RESEARCH DIRECTIONS AND CONCLUSION 

Taken together, the studies described above indicate that the additive 
independence assumption is frequently violated. Thus, the additive 
decomposition of multistate health profiles cannot be relied on to give an 
acceptable estimate of holistic preferences. Therefore, it is critical to 
investigate and formulate an alternative decomposition. 

A number of studies (some within the health domain, others in different 
domains) have explored or identified characteristics that affect people's 
preferences for multistate profiles. These influential factors could lead to, 
and partially explain, the violation of the additive independence assumption. 
A review of the studies exploring such influential factors is given below. 

10.4.1 Empirical studies 

Rate of Change. Hsee and Abelson [28] performed experiments to find a 
relationship between satisfaction (utility) and rate of change of the outcomes 
or what they called velocity in the contexts of gambling (probability of 
winning the game), class rank (the percentile standing in a hypothetical 
class), and stock (a hypothetical stock price). They found satisfaction to be 
positively related to actual outcome position and rate of change (or velocity) 
of the outcomes over time. 

In Chapman's study [29], subjects rated ten sequences, with five different 
slopes for each of two overall trends (increasing or decreasing), in the 
health and money domains using a 0 to 100 visual analog scale. Slope (rate 
of change) was one of the significant factors affecting the rating scores. 
Subjects preferred gradually increasing or decreasing sequences to those 
with steep slopes. 

These results were in conflict with the findings by Hsee and Abelson [28], 
which suggested that subjects preferred steep slopes for increasing 
sequences but small slopes for decreasing sequences. However, Hsee and 
Abelson did not control for the total number of units of outcome over a 
specific period of time while Chapman did. Thus, preference for higher rate 
of change in positive outcome in the findings by Hsee and Abelson might be 
the result of a higher amount of outcomes received within the specific period 
of time. Ariely [30] also found a significant effect of rate of change in a 
study of retrospective pain evaluation in the experience of heat stimuli on the 
forearms. The results showed evidence of a rate-of-change effect, as the 
subjects reported experiencing higher pain when the intensity steeply 
increased than when it gradually increased. 

Trend. Several empirical studies found a significant impact of trend of the 
overall profile (improvement versus decrement) on preferences [29-36]. For 
example, Chapman [34] explored preferences for improving or declining 
sequences in the domains of headache pain, athletic ability, facial acne and 
facial wrinkles. Those sequences were designed so that, if the additive 
assumption held, they should be equally preferred. She found that subjects 
strongly preferred the improving sequences to the declining ones. Moreover, 
Chapman [29] explored preferences for both sequences of health and 
monetary outcomes and found that subjects preferred improving sequences 
for both health and money for short sequences (1 year) whereas for the long 
sequences (lifetime), subjects preferred decreasing sequences for health but 
increasing sequences for money. She explained that the subjects preferred 
the decreasing sequences for lifetime health since they used their expectation 
as a reference point and exerted judgment by considering how close the 
profile in question was to their reference point. When considering a long 
time horizon such as a lifetime, subjects expect perfect health early and 
gradually declining health as they get older. 

Loewenstein and Sicherman [31] also showed that a majority of subjects 
preferred an increasing sequence of wage profiles over a five-year period to 
a declining sequence. In a very different domain, Loewenstein and Prelec 
[32] found that a majority of subjects who reported having a preference for a 
French restaurant over a Greek restaurant also reported a preference for a 
dinner at a Greek restaurant first and at a French restaurant later, thus 
showing a preference for an improvement trend. 

Spreading of Outcomes. Loewenstein and Prelec [37] found that decision 
makers prefer outcomes that are spread across the time interval considered. 
For example, the majority of the subjects who were offered two free dinners 
preferred to distribute the two dinners across the time interval. This 
preference for spreading was confirmed by Chapman [38] who performed a 
study involving scenarios including both gains and losses in the contexts of 
monetary outcomes (win a prize or pay a fine), dinner (pleasant or 
unpleasant dinner), and health-related events (a painful trip to the dentist or a 
pain-relieving trip 

Peak, Final Outcome, and Duration of the Profiles. In medical decision 
making, retrospective pain evaluation is an important matter since it reflects 
patients' memories of how painful the treatment was and could impact their 
decisions regarding future treatments. A number of empirical studies have 
demonstrated that retrospective pain evaluation is influenced by the peak and 
the final moment of the experience and not significantly impacted by the 
overall duration of the painful experience itself [39-45]. For example, Varey 
and Kahneman [39] asked 46 subjects to evaluate different discomfort 
profiles ranging from 15 to 35 minutes. They found that subjects’ 
evaluations were significantly impacted by peak and final intensity but not 
by duration. 

The same phenomenon was also found in the retrospective evaluation of 
watching pleasant and unpleasant video clips [41] and in patients’ 
retrospective evaluations of experiences in undergoing colonoscopy and 
lithotripsy [42]. Kahneman et al. [40] performed an experiment in which 
thirty-two subjects underwent two trials. In the short trial, subjects 
immersed one hand in 14°C water for 60 seconds. In the long trial, they 
immersed the other hand in 14°C water for 60 seconds and then kept it 
immersed for another 30 seconds while the temperature was gradually raised 
to 15°C (a total duration of 90 seconds). The majority indicated that the 
long trial caused less overall discomfort, showing a final-intensity effect 
and duration neglect. 
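
The contrast between a duration-weighted total and a peak-and-end summary 
can be illustrated with a minimal sketch. The per-second discomfort scores 
and the simple average of peak and final intensity are hypothetical 
assumptions made for illustration; they are not data or a model taken from 
the studies cited. 

```python
# Two ways of summarizing a discomfort trace (one score per second).
# The traces and the peak-end formula are illustrative assumptions.

def total_discomfort(trace):
    """Duration-weighted total, analogous to an additive model."""
    return sum(trace)

def peak_end(trace):
    """Average of the worst moment and the final moment."""
    return (max(trace) + trace[-1]) / 2

# Short trial: 60 s at a constant discomfort of 8.
short_trial = [8.0] * 60
# Long trial: the same 60 s, plus 30 s easing linearly from 8 down to 5.
long_trial = [8.0] * 60 + [8.0 - 3.0 * (t + 1) / 30 for t in range(30)]

print("total   :", total_discomfort(short_trial), total_discomfort(long_trial))
print("peak-end:", peak_end(short_trial), peak_end(long_trial))
```

Under these assumed numbers the long trial contains strictly more total 
discomfort, yet its peak-end summary is lower, which is the pattern that the 
retrospective evaluations above display. 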

Timing of Health Outcomes (Health Discounting Behavior). When 
evaluating health outcomes in the future, values of the outcomes are usually 
discounted. In cost-effectiveness analysis, discount rates are typically 
applied in order to deal with this time preference issue. Numerous studies 
have explored individuals’ discounting behavior. For example, discount 
rates have been found to decrease as delays increase in the contexts of 
back pain [46], colostomy, blindness, and depression [47], and health and 
money [48-49]. The magnitude of the outcomes also affects health 
discounting behavior: smaller outcomes were discounted at a higher rate 
than larger outcomes [48-49]. Another finding was an effect of the sign of 
the outcomes: delayed gains were discounted more than delayed losses [50]. 
Ganiats et al. [51] studied health discounting for five different disease 
conditions (chicken pox, Parkinson’s disease, tropical disease, migraine 
headache, and sterilization) and found that discount rates were sometimes 
very high (up to 116%) and varied markedly across disease conditions. 
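
A discount rate that falls as the delay grows is what a hyperbolic discount 
function would produce. The sketch below assumes a hyperbolic factor 
1 / (1 + k * delay) with a hypothetical parameter k and reports the constant 
annual rate that would give the same factor at each delay. The functional 
form and the value of k are illustrative assumptions, not estimates from the 
studies cited. 

```python
# Implied annual discount rate under an assumed hyperbolic discount function.

def hyperbolic_factor(delay_years, k=0.5):
    """Present value of one unit received after delay_years (k is hypothetical)."""
    return 1.0 / (1.0 + k * delay_years)

def implied_annual_rate(delay_years, k=0.5):
    """Constant rate r with (1 + r) ** -delay equal to the hyperbolic factor."""
    return hyperbolic_factor(delay_years, k) ** (-1.0 / delay_years) - 1.0

for delay in (1, 5, 10, 20):
    print(delay, round(implied_annual_rate(delay), 3))
```

With a constant (exponential) discount rate, as is typically applied in 
cost-effectiveness analysis, the printed rate would be the same at every 
delay. 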

10.4.2 Conclusion 

The results of the studies described above can help researchers and decision 
makers understand the nature of the violation of the additive independence 
assumption and should assist in uncovering a more suitable decomposition. 
While those studies provide an excellent starting point, more empirical work 
needs to be performed. More importantly, we need to interpret the results in 
such a way that they can lead to, and be incorporated into, a new aggregation 
structure. At the same time, we need to develop a new theoretical foundation 
for the decomposition of multistate health profiles. It is necessary to extend 
the applicability of the QALY model to handle multistate health profiles 
appropriately, especially if one wants to apply such models to chronic 
conditions. 

SUMMARY 

Operations research (OR) provides an excellent set of tools for decision 
makers who regulate the use of new treatments or medications. The decision 
about whether to use a new treatment must typically be made well before 
long-term trials or database studies can be conducted. However, large 
amounts of information about new treatments are available from the clinical 
trials required for drug registration. OR models can synthesize this 
information and use it to predict expected costs and benefits of long-term 
treatment use within a given population. Such analysis provides valuable 
additional information for the decision maker when a novel treatment is 
initially being considered. These analyses are like duct tape for the decision 
maker: they are designed to make use of the best currently available 
information to help current decisions, thereby bridging the gap until better 
information becomes available. 

11.1 INTRODUCTION 

In a general sense, health economics and health outcomes research (which 
are both encompassed in pharmacoeconomics) are the study of the impact of 
new medications on society. They attempt to capture both the health 
benefits of a new medication such as improved survival (either through a 
cure or by slowing disease progression), lessened pain, improved 
functioning, and improved quality of life, as well as the economic impacts of 
a new medication. The preferred perspective is the societal point of view. 
However, other perspectives may be adopted, including that of a 
governmental or private health insurer or a patient. By capturing both 
benefits and costs, such methods illuminate the trade-offs involved in 
decisions about whether to use a medication in specific populations. 

New medications are brought to market after undergoing a series of rigorous 
clinical trials, which are designed to test the safety and efficacy of the new 
medication. These clinical trials are typically of comparatively short 
duration (such as six months) due to the costs incurred in running them and 
the difficulty in tracking patients over time. However, many diseases are 
chronic conditions that progress over time. For such diseases, it may be 
possible to demonstrate the efficacy of new treatments within the time frame 
of the clinical trial by examining disease markers (such as blood pressure for 
hypertension, viral load for HIV, or FEV [forced expiratory volume, a lung 
function test] for asthma) or by showing a lessening of symptoms or acute 
attacks (such as number of myocardial infarctions in heart disease, or pain 
with arthritis). However, the potential long-term impacts of new 
medications cannot be determined from short-term clinical trials. 

The economic consequences of new treatments are also not directly available 
from clinical trials. Beyond drug acquisition costs, many factors influence 
the overall economic impact of a new treatment. Acquisition costs can be 
offset by lessening the number of serious disease-related events that require 
hospitalization (such as a heart attack or stroke) or urgent care in an 
emergency room (such as severe asthma attacks or severe bowel disorders). 
A new treatment may reduce the number of medications that patients must 
consume. However, some costs may increase the economic burden of a new 
treatment. Side effects of a medication may require treatment in order for 
the patient to be able to continue to take the medication. Serious adverse 
events may even require hospitalization and medical procedures. The 
overall economic impact of a medication can be estimated by combining the 
observed results from the clinical trial with treatment patterns and costs of 
care seen in clinical practice. 

The best methods for determining the long-term impact of new medications 
would be to either conduct naturalistic clinical trials after approval of a new 
medication or to conduct database studies of longitudinal data collected as 
the medication is used in practice. Such studies would provide the best and 
most reliable answers to questions about the future costs and benefits of a 
new medication. Unfortunately, this approach is of no use to decision 
makers who are faced with the questions now, at the time the medication is 
released. 

A decision maker may be asking the following questions: 

• How many future undesirable health consequences will the 
medication prevent? 
    o How many hospitalizations will be avoided? 
    o How many years of pain or diminished quality of life will be 
      averted? 
    o Will these adverse events be avoided or merely delayed? 
    o Will the treatment have better/worse results in certain 
      populations? 

• How many patients under my care will receive the medication? 

• What other treatment options exist, and how effective is each? 

• What adverse events are associated with the medication? 
    o How often will such events occur? 
    o What types of treatment do such events necessitate? 

• What is the budgetary impact of the medication? 

• Is the acquisition cost offset by other cost factors such as reductions 
in 
    o Emergency room visits or hospitalization? 
    o Use of other medications? 
    o Administration costs? 

• What tradeoffs are made in choosing whether to accept this 
medication? 

Decision makers must find some way to convert the available short-term 
clinical trial information into information about the long-term impacts of a 
new medication. Operations research (OR) models are ideally suited for this 
task. They can model disease progression and the impacts of new 
medications. Depending on the available data, the models can range from 
simple decision trees to complex simulations. Three case studies of analyses 
that have been conducted using OR techniques are described in detail below. 

11.2 DECISION TREE MODEL OF ANTIHYPERTENSIVE 
MEDICATIONS 

11.2.1 Application 

Hypertension is a chronic condition that can lead to very serious cardiac 
complications. Because of these complications, there is general agreement 
on the need to treat hypertension and on the economic benefits of 
antihypertensive therapy [1]. While exercise and diet are the preferred first- 
line treatment, many patients have only marginal success in following such 
recommendations. Second-line treatments include antihypertensive 
medications. Physicians have numerous pharmacologic options for 
hypertension therapy: over 100 antihypertensives spanning eight classes of 
therapy are on the market worldwide. If one therapy does not work, a 
patient can simply be switched to another [2]. 

A new antihypertensive medication, an angiotensin-II inhibitor, was released 
in 1999. The new drug had efficacy similar to or slightly better than that of 
existing medications, cost more than most existing medications (many of 
which are generic), and had a different, mild adverse event profile. Clinical 
trials of this new drug typically lasted six months at most. Decision 
makers wanted to know how this new drug would affect the managed care 
system. To answer this question, a decision analytic model was developed. 
The model was designed to explore the costs and consequences of treating 
mild-to-moderate uncomplicated hypertension starting with an angiotensin- 
II (A-II) inhibitor, relative to four other drugs - a diuretic, a beta-blocker, an 
angiotensin converting enzyme (ACE) inhibitor, and a calcium channel 
blocker (CCB). 

11.2.2 Methodology 

Current hypertension treatment patterns were ascertained from a literature 
review and a physician survey [3]. Key model data obtained from these 
efforts included the following: 

• A patient with uncontrolled mild-to-moderate hypertension is seen 
monthly. 

• Determining whether or not a specific drug is effective at lowering 
blood pressure can take up to three months. 

• During these three months, patients may increase the dosage of their 
medication or switch to another therapy. 

• Therapies may be switched either because 
    o The patient has experienced intolerable adverse events, or 
    o The drug has failed to control hypertension. 

• Once hypertension has been controlled, the patient is seen every 
three months. 

The physician survey also provided information that was used to estimate 
the probability that a particular drug is chosen as first-line therapy and the 
probability of the choice of second-line therapy (given that the first-line 
therapy fails). These probabilities are shown in Table 11.1. The model 
assumes that the remaining drugs have an equal likelihood of being chosen 
for third-, fourth-, and fifth-line therapies. 



Table 11.1 Physicians' probability of choosing hypertension medications 

                 Probability    Probability of choosing second-line therapy 
                 of choosing    given that first-line therapy fails 
First-line       first-line 
therapy          therapy        Diuretic  Beta-blocker  CCB     ACE     A-II 

Diuretics        0.311          -         0.348         0.140   0.440   0.072 
Beta-blockers    0.356          0.310     -             0.082   0.573   0.035 
CCB              0.072          0.026     0.375         -       0.564   0.035 
ACE inhibitors   0.239          0.361     0.252         0.297   -       0.090 
A-II inhibitors  0.022          0.337     0.288         0.360   0.015   - 

Copyright 2001 Medicom International. Reprinted with permission. 



This information was used to construct a series of decision trees. The main 
decision tree (Figure 11.1) determines the outcomes associated with a 
specific sequence of drugs. All drug sequences are enumerated and the tree 
can be rolled back to any level using the probabilities shown in Table 11.1. 
Each branch is evaluated in terms of expected time to control and cost of 
choosing that particular sequence of drug therapies. 
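
The enumeration of drug sequences can be sketched as follows. The 
second-line probabilities are those reported in Table 11.1 for a patient who 
starts on a diuretic, and later choices are treated as equally likely among 
the remaining drugs, as the model assumes. The function and variable names 
are illustrative, and the sketch computes only the probability of each 
sequence, not its cost or expected time to control. 

```python
from itertools import permutations

DRUGS = ["Diuretic", "Beta-blocker", "CCB", "ACE", "A-II"]

# P(second-line drug | first-line diuretic fails), from Table 11.1.
SECOND_LINE_AFTER_DIURETIC = {
    "Beta-blocker": 0.348, "CCB": 0.140, "ACE": 0.440, "A-II": 0.072,
}

def sequence_probabilities(first, second_line):
    """Probability of each full ordering of drugs, given the first-line drug.

    The second-line drug follows the surveyed probabilities; every later
    drug is drawn uniformly from those not yet tried.
    """
    others = [d for d in DRUGS if d != first]
    probs = {}
    for order in permutations(others):
        p = second_line[order[0]]
        remaining = len(others) - 1  # drugs left when third-line is chosen
        while remaining > 0:
            p *= 1.0 / remaining
            remaining -= 1
        probs[(first,) + order] = p
    return probs

probs = sequence_probabilities("Diuretic", SECOND_LINE_AFTER_DIURETIC)
print(len(probs), "sequences, total probability", round(sum(probs.values()), 6))
```

Rolling the tree back to the first-line drug then amounts to weighting each 
sequence's outcome by its probability and summing. 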

For each medication considered along the branches of the main decision tree, 
it was necessary to determine the medication’s probability of achieving 
hypertension control, its adverse event rate and its cost. Since patients are 
typically started at low doses of medication and then have their doses titrated 
upwards as necessary for hypertension control, each medication can be given 
at varying doses. Each dose of medication has an associated probability of 
achieving hypertension control, adverse event rate and cost. The probability 
of each drug being titrated was available from the comparative clinical trials 
of the new A-II inhibitor, as were drug dose efficacy and adverse event rates. 
Therefore, a series of additional decision trees using the titration likelihood 
and the effects of a given drug at each dosage level was used to calculate 
the drug's overall probability of achieving hypertension control, its overall 
adverse event rate and its overall cost. 

Figure 11.1 Main decision tree: Drug sequences 

Copyright 2001 Medicom International. Reprinted with permission. 

Adverse event treatment algorithms were designed for each adverse event 
based on hypertension severity level. However, it was believed that patients 
suffering from continual moderate and severe adverse events would not 
remain on drug therapy, whereas those with mild adverse events would. 
Therefore, adverse events (and hence adverse event costs) were divided into 
two types — first-quarter adverse event costs, which include all of the adverse 
events experienced during the clinical trials and are incurred only during the 
first three-month period when a patient is placed on a new medicine, and 
maintenance adverse event costs, which include only costs resulting from 
mild adverse events and are incurred quarterly while a patient remains on 
medication. The model incorporates costs of drugs, physician visits, and 
adverse event treatments. Table 11.2 summarizes the efficacy and cost inputs 
that the model requires. 

11.2.3 Results 

Combining the cost and efficacy information presented in Tables 11.1 and 
11.2 in the decision tree model structure shown in Figure 11.1, it is possible 
to roll the decision tree back to the choice of initial therapy. Therefore, the 
baseline results, shown in Table 11.3, are the weighted average of all 
pathways possible for each initial drug prescribed. For example, results 
reported for the diuretic are the weighted average of the results of all drug 
sequences in which the diuretic is initially prescribed (24 possible 
sequences, as shown in Figure 11.1). 

Table 11.2 Efficacy and cost data for decision tree in Figure 11.1 

                  Efficacy        Costs 

                  Probability     Expected          First-quarter   Quarterly 
                  of              quarterly drug    expected        expected 
First-line        hypertension    cost (given       adverse         maintenance 
drug therapy      control         successful        event costs     adverse 
                                  treatment)                        event costs 

CCB               [illegible]     [illegible]       $850            $104 
Beta-blocker      [illegible]     $4.37             $804            $39 
ACE inhibitor     0.53            $106.35           $639            $57 
Diuretic          0.71            $1.09             $301            $56 
A-II inhibitor    0.72            $127.11           $332            $43 

Copyright 2001 Medicom International. Reprinted with permission. 



The model time horizon is initially set at 15 months, the longest time it 
would take to cycle through all possible first-line medications. The measure 
of efficacy (expected time to hypertension control) does not depend on the 
model time horizon; the cost results, however, do depend on the time 
horizon because costs accrue continually. Therefore, in Table 11.3, the 
initial expected total 15-month cost is presented as well as the expected 
total costs that would be accrued every three months thereafter. In this 
manner, a decision maker can choose a time horizon of interest and calculate 
the total costs over the entire time horizon from these results. 

Table 11.3 Efficacy and cost results (expected time to control and 
expected costs) from the decision tree analysis 

Comparison between first-line therapies with/without a new A-II 

                                Diuretic  Beta-blocker  ACE      CCB      A-II 

Expected time to control 
(months) 
  No A-II therapy               [illegible] 
  With A-II therapy             [illegible] 

Expected total cost 
(at 15 months) 
  No A-II therapy               $2,075    $2,434        $2,846   $3,013   - 
  With A-II therapy             $2,057    $2,426        $2,838   $3,018   $2,392 

Expected quarterly 
maintenance cost (accrued 
every 3 months after the 
initial 15-month period) 
  With A-II therapy             $235      $228          $298     $347     $309 

Copyright 2001 Medicom International. Reprinted with permission. 



For any given initial therapy except the CCB, the inclusion of the A-II 
inhibitor in the subsequent therapeutic options reduced the expected costs (at 
15 months). The reduction in cost was mainly due to the lower initial 
adverse event costs of the A-II inhibitor and the ability to avoid using the 
CCB, which is by far the costliest drug in terms of drug acquisition and 
adverse event costs. 

Initiating therapy with the A-II inhibitor (which is not currently common 
practice) is the second least expensive option. However, the savings over 
the other therapies that the model predicts during its 15-month time horizon 
will be reduced over time given the A-II inhibitor’s drug acquisition costs 
and expected quarterly maintenance costs. 
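
The way a decision maker might extend these results to a longer horizon can 
be sketched as follows. The figures are those read from the "With A-II 
Therapy" rows of Table 11.3; the function name and the restriction to 
horizons of 15 months plus whole quarters are assumptions made for this 
illustration. 

```python
# Expected total cost by first-line drug for horizons beyond 15 months,
# using the "With A-II Therapy" figures as read from Table 11.3.

COST_15_MONTHS = {
    "Diuretic": 2057, "Beta-blocker": 2426, "ACE": 2838, "CCB": 3018, "A-II": 2392,
}
QUARTERLY_MAINTENANCE = {
    "Diuretic": 235, "Beta-blocker": 228, "ACE": 298, "CCB": 347, "A-II": 309,
}

def total_cost(drug, horizon_months):
    """15-month cost plus maintenance cost for each further full quarter."""
    if horizon_months < 15 or (horizon_months - 15) % 3 != 0:
        raise ValueError("horizon must be 15 months plus a whole number of quarters")
    extra_quarters = (horizon_months - 15) // 3
    return COST_15_MONTHS[drug] + extra_quarters * QUARTERLY_MAINTENANCE[drug]

for months in (15, 18, 36):
    ranking = sorted(COST_15_MONTHS, key=lambda d: total_cost(d, months))
    print(months, [(d, total_cost(d, months)) for d in ranking])
```

On these figures, initiating therapy with the A-II inhibitor costs less than 
initiating with a beta-blocker at 15 months but more once one further 
quarter of maintenance cost is added, which illustrates how the predicted 
savings shrink over time. 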

Extensive sensitivity analyses were conducted to test the stability of the 
model and its results. These analyses and the full model are reported in the 
literature [4-7]. 

The information gained from this analysis provided cost and efficacy 
estimates to decision makers before the new drug was used in clinical 
practice. In addition, it provided important input for two different policy 
questions: Should the drug be available as a second or later therapy once the 
standard initial therapy has failed? Should the drug be made available as a 
potential first-line therapy? The goal of the analysis was to provide decision 
makers with information that can be used to improve the therapeutic options 
for hypertension care. 

11.3 MONTE CARLO SIMULATION OF HIV/AIDS VIRAL LOAD 
TESTING 

11.3.1 Application 

This case study examines a question that arose when viral load testing was 
still novel. However, it demonstrates a technique that remains pertinent with 
the introduction of any new monitoring/testing method. The question of 
interest is the following: Given a new method for testing how well a patient 
is responding to medications, how frequently should the test be conducted? 

This question is of great importance with the human immunodeficiency virus 
(HIV) infection since the virus mutates rapidly over time and can become 
resistant to medications. When a patient’s virus becomes resistant to the 
medication, viral load levels in that person increase rapidly. High levels of 
viral load damage a patient’s immune system and lessen the person’s ability 
to fight off common infections. Therefore, it is important to catch the point 
of viral resistance to medication as soon as possible so that the patient may 
be placed on a new medication. 

In HIV, the matter is further complicated by the question of adherence to 
medications. The medications used to treat HIV cause numerous unpleasant 
side effects and are difficult to take. The medications must be taken 
continuously every 8, 12, or 24 hours. When a patient ingests subtherapeutic 
levels of medication due to repeated “drug holidays” or other nonadherence 
to the medication regimen, such as skipping doses or only taking certain 
drugs in a combination therapy, the virus develops resistant mutants more 
rapidly. This shortens the period of beneficial effects from the medication 
and allows for the development of multi-drug-resistant HIV strains. Current 
treatment guidelines strongly emphasize the importance of good adherence 
to medication, and delineate the possible negative consequences of 
nonadherence [8-10].

For HIV, the testing question is thus the following: Considering both the
inherent progression of the disease and the possibility of nonadherence to
medications, what is the optimal viral load testing frequency? 

The frequency of viral load testing determines how quickly viral rebound is 
detected and how soon a patient is switched to the next therapeutic option. 
Thus, the frequency of viral load testing may affect the cost of treatment, the 
pattern of antiretroviral drug use, and (possibly) the quality of life and life 
expectancy for HIV-infected individuals. Annual costs of care and the 
lifetime cost per person may be affected by differences in the duration of 
highly active antiretroviral therapy (HAART) drug regimens, how soon patients
are placed on more expensive (four-drug) therapies, and the cost of treatment 
for opportunistic infections and other medical care for individuals at 
different levels of immune suppression. Patterns of therapy are affected 
because different monitoring frequencies may cause regimens to be 
administered for different lengths of time. Patient outcomes may also be 
affected by different progression rates induced by the varying durations of 
suboptimal therapy. 

A Monte Carlo simulation was designed to examine the question of optimal 
testing frequency. The simulation captured HIV disease progression in the 
presence of medications and their varying efficacy and levels of medication 
adherence. Using data on costs and consequences of HIV disease, the model 
estimates health outcomes and costs for patients undergoing three different 
frequencies of viral load testing (every month, every three months, and every 
six months). Four hypothetical populations, described by disease stage and 
rate of disease progression, are examined. These groups are patients with: 

1. Moderate disease stage, average disease progression; 

2. Moderate disease stage, fast disease progression; 

3. Moderate disease stage, slow disease progression; and 

4. Severe disease stage, average disease progression. 

These population groups are analyzed under varying assumptions about 
adherence to medication. This disaggregate analysis is performed to capture 
the possible influence of each of these factors (disease stage and rate of 
disease progression) on the impact of viral load testing frequency. 

11.3.2 Methodology 

A Monte Carlo simulation is performed for each population group, tracking 
the disease progression of individuals for five years. The population groups 
are distinguished by their initial average CD4 cell counts (a measure of 
immune system function - higher numbers of these cells are better) and their
initial average baseline viral loads. These two parameters provide estimates
of how advanced the disease is and how fast individuals are expected to 
progress, and thus define the four groups described above. 

Figure 11.2 provides a schematic of the Monte Carlo simulation. During 
each simulation, individual patients are simulated and their baseline viral 
loads and CD4 cell counts are determined randomly, according to the 
population’s probability distributions. The results reported here represent 
outcomes for a simulated population of 5,000 individuals. The model tracks 
on a monthly basis each simulated patient’s CD4 cell count, viral load, AIDS 
status, possible death, testing costs, drug therapy costs, and medical care 
costs. 




Patients are treatment naive at the start of the model (that is, their viral 
strains are not resistant to any of the available medications). Patients are 
followed for five years, during which time they are treated with a sequence 
of combination drug regimens. The regimens are chosen from the consensus 
statement of the International AIDS Society - USA Panel [10]. When a 
therapy is first effective, viral load is undetectable and CD4 cell counts 
increase. As a therapy continues to be effective, the viral load remains 
undetectable and no CD4 cells are lost. Once the patient’s viral load 
becomes detectable, his/her CD4 cell count declines at a rate determined by 
the initial viral load. If the patient’s viral load is detectable when he/she is 
tested, the patient is switched to the next drug regimen. 











Patients have a 40 percent likelihood of being nonadherent to antiretroviral 
medications during the first combination therapy. When a patient is 
nonadherent, the viral load level rebounds sooner (as determined by a 
probability distribution) and the patient must switch to a new drug regimen. 

Within the model, the following parameters are simulated for each 
individual from a probability distribution (in parentheses): 

• Initial viral load and initial CD4 cell count (truncated normal and 
uniform distribution, respectively); 

• Rate of CD4 cell count decline given a viral load level [11] (uniform 
distribution); 

• Monthly probability of progressing to AIDS, depending on CD4 cell 
count (discrete distribution); 

• Monthly probability of death, depending on CD4 cell count (discrete 
distribution); 

• Probability that a therapy will be effective (varies with therapy type 
and order) (discrete distribution); 

• Duration of effectiveness of a given therapy (varies with therapy 
type and order) (truncated normal distribution); 

• Increase in CD4 cell count given an effective therapy (uniform 
distribution); 

• Probability of patient nonadherence during the first therapy (discrete 
distribution); and 

• Monthly probability that resistance develops due to nonadherence 
(discrete distribution). 

The following parameters have set values for all individuals: 

• Cost of drug therapy [12]; 

• Testing cost; 

• Other medical care costs (dependent on CD4 count and AIDS 
status); and 

• Salvage therapy costs once the antiretroviral medications have been 
exhausted. 
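
The simulation logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model reported in the literature: only the five-year horizon and the 40 percent nonadherence probability come from the text, while the regimen durations, the test cost, and the number of regimens are hypothetical placeholders. CD4 dynamics, progression to AIDS, and death are omitted so that the effect of the testing interval on detection delay stays visible.

```python
import random


def simulate_patient(rng, test_interval, months=60, n_regimens=4,
                     p_nonadherent=0.40, test_cost=100.0):
    """Simulate one patient month by month.

    Returns (delay_months, testing_cost), where delay_months counts months
    spent on a failing regimen while another regimen was still available.
    Only the 60-month horizon and the 40% nonadherence probability come
    from the text; every other number is a hypothetical placeholder.
    """
    nonadherent_first = rng.random() < p_nonadherent

    def draw_duration(regimen):
        # Normal draw truncated below at one month; nonadherence during
        # the first regimen brings viral rebound forward.
        mean = 12.0 if (regimen == 1 and nonadherent_first) else 24.0
        return max(1.0, rng.gauss(mean, 6.0))

    regimen = 1
    fails_at = draw_duration(regimen)
    delay_months = 0
    testing_cost = 0.0
    for month in range(1, months + 1):
        detectable = month >= fails_at
        if detectable and regimen < n_regimens:
            delay_months += 1
        if month % test_interval == 0:
            testing_cost += test_cost
            if detectable and regimen < n_regimens:
                # Detected rebound: switch to the next regimen.
                regimen += 1
                fails_at = month + draw_duration(regimen)
    return delay_months, testing_cost


def run_population(test_interval, n_patients=5000, seed=0):
    """Average the per-patient outcomes over a simulated population."""
    rng = random.Random(seed)
    results = [simulate_patient(rng, test_interval) for _ in range(n_patients)]
    return {
        "mean_delay_months": sum(r[0] for r in results) / n_patients,
        "mean_testing_cost": sum(r[1] for r in results) / n_patients,
    }


if __name__ == "__main__":
    for interval in (1, 3, 6):
        print(interval, run_population(interval))
```

Running the three testing intervals side by side shows the trade-off that the full model quantifies: more frequent testing raises testing costs but shortens the time a patient spends on a regimen that has stopped working.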

11.3.3 Results 

Results for each of the four populations are presented in Table 11.4. The 
outcomes assume that each population is composed of 5,000 individuals. 
(This number was chosen so that the simulations would have time to 
converge.) Smaller populations, as would be seen in clinical practice, will 
have results distributed around the mean of the larger population, depending
on the variance about the mean and the actual size of the smaller population
considered. This distribution implies that actual practice experience may be 
different from the population’s true mean. 
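
The dependence on population size can be illustrated with a short sketch. It assumes a hypothetical per-patient cost that is normally distributed (mean 100,000, standard deviation 30,000); these figures are placeholders chosen for illustration and do not come from the study. The spread of group means shrinks roughly with the square root of the group size.

```python
import random
import statistics


def spread_of_group_means(group_size, n_groups=500, seed=0):
    """Standard deviation of the mean cost across many simulated groups.

    The per-patient cost distribution is a hypothetical placeholder.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(100_000, 30_000) for _ in range(group_size))
        for _ in range(n_groups)
    ]
    return statistics.stdev(means)


if __name__ == "__main__":
    for size in (25, 100, 400):
        print(size, round(spread_of_group_means(size)))
```

A practice with a few dozen patients can therefore observe average costs that differ noticeably from the simulated mean even when the model is correct.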

Table 11.4 shows the incremental results of implementing viral load testing 
every month, every three months, and every six months. Results are 
expressed in terms of incremental quality-adjusted life years (QALYs) 
gained and incremental costs. In Populations 1, 2, and 3 (average, fast, and
slow progressors, respectively, in a moderate disease state), the decrease in
antiretroviral drug costs and decreases in other medical care costs offset the
increase in testing costs when testing frequency is increased from every six
months to every three months. Increasing testing frequency from every three
months to every month increases costs (due to the additional testing costs),
but yields no appreciable gain in QALYs. Thus, this option is not
cost-effective. In Population 4 (advanced disease state in an average progressor),
increasing the testing frequency from every six months to every three months
also results in a net cost savings. Increasing testing frequency further to
every month increases costs due to the increased testing costs. However,
since Population 4 is in an advanced disease state, there is a small gain in 
QALYs. The incremental cost-effectiveness ratio is $23,400 /QALY. This 
value is low compared with many currently accepted interventions, and it 
can easily be argued that this option is cost-effective. 
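
The incremental cost-effectiveness ratio is the change in cost divided by the change in QALYs. The sketch below shows one way to classify an option as cost-saving, not worthwhile, or cost-effective against a threshold. The function name `assess`, the default threshold of $50,000 per QALY, and the example inputs are illustrative assumptions, not values from the study.

```python
def assess(delta_cost, delta_qalys, threshold=50_000.0):
    """Classify an option relative to its comparator.

    Returns (label, ratio); ratio is None unless cost does not fall and
    QALYs rise. The default threshold is an illustrative assumption.
    """
    if delta_cost < 0 and delta_qalys >= 0:
        return "cost-saving", None
    if delta_qalys <= 0:
        return "no QALY gain", None
    ratio = delta_cost / delta_qalys
    label = "cost-effective" if ratio <= threshold else "not cost-effective"
    return label, ratio


if __name__ == "__main__":
    # Hypothetical inputs: an extra 11,700 in cost buys 0.5 QALYs.
    print(assess(11_700, 0.5))
    # Hypothetical inputs: lower cost with a small QALY gain.
    print(assess(-5_000, 0.01))
```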

This analysis permitted an investigation that was not possible at the time of 
the original question. Given the best available data at the time, the analysis 
provided insight into a series of decisions that the managed care companies 
were facing when they first included HIV viral load testing in their benefit 
packages. As new tests - both for HIV and other diseases - become 
available, a similar type of model can be constructed to provide insight into 
the most appropriate testing frequency. 

11.4 MARKOV MODEL OF CANCER TREATMENT MEDICATIONS 

11.4.1 Application 

This case study demonstrates an application of OR techniques for 
comparative analysis of new medications still in development. This is the 
most hypothetical case study since few clinical trials exist from which to 
gather efficacy and safety data. However, even in the earliest stages of new 
drug development, it is possible, and frequently advantageous, to examine 
the drugs in terms of their potential benefits and costs to the end users (and 
end decision makers).

Table 11.4 Incremental cost and QALYs by testing frequency

Population and change in     Change in            Change in Cost          Incremental
testing interval             QALYs                (per 100 patients/yr)   C/E Ratio

1 - Moderate Disease Stage / Average Progressors
  3 mos. to 1 month          No Change            Increase of $9,400      N/A
  6 mos. to 3 mos.           Very Small Increase  Decrease of $47,400     Cost-saving

2 - Moderate Disease Stage / Fast Progressors
  3 mos. to 1 month          No Change            Increase of $13,600     N/A
  6 mos. to 3 mos.           Very Small Increase  Decrease of $48,500     Cost-saving

3 - Moderate Disease Stage / Slow Progressors
  3 mos. to 1 month          No Change            Increase of $13,000     N/A
  6 mos. to 3 mos.           Very Small Increase  Decrease of $47,900     Cost-saving

4 - Advanced Disease Stage / Average Progressors
  3 mos. to 1 month          Very Small Increase  Increase of $7,100      $23,400/QALY
  6 mos. to 3 mos.           Small Increase       Decrease of $51,500     Cost-saving

Treatment of solid cancer tumors remains a serious unmet medical need. 
Broad acting agents could provide significant treatment advances. Current 
chemotherapies can provide palliation and increased survival time. 
However, they are highly toxic and are only effective in specific patient 
populations and/or for specific tumor types. The accumulation of new 
genetic and biological information about cancer is creating the possibility of 
developing new drugs with broad activity and less toxicity than current 
chemotherapies.

In this case study, the objective was to produce a basic, flexible computer
model that incorporates the treatment paths, costs, and outcomes associated
with the management and treatment of solid cancer tumors. The model
incorporates data on available treatments, their efficacy, and associated
adverse events. This permits a comparison between existing treatments and
potential new medications. It also allows the model to be used by a variety of
decision makers, such as managed care companies or pharmaceutical
companies themselves, who want to know what levels of safety, efficacy, and
cost a potential new treatment would need to achieve to be a
valuable option in comparison to existing treatments.

More specifically, the model examines the potential effects of new 
anticancer compounds on the health benefits (mortality, disease-free 
survival, etc.) and total treatment costs of solid cancer tumors. The basic 
structure allows the model to be quickly adapted to different solid tumor 
cancers such as breast or colorectal cancer. The treatment costs included in 
the model are the costs of surgery, chemotherapy, radiotherapy, hormone 
therapies and procedures, and diagnostic tests and procedures. Into this
mix of treatments, the new medication can be added as a replacement 
therapy or as an adjuvant therapy. The model also calculates the impact of 
the new compounds on patients’ quality of life. 

11.4.2 Methodology 

The base model was constructed using a Markov framework. Figure 11.3 is
a diagram of treatment pathways. Patients enter into the model in one of 
these disease states and then follow the pathways, marked in arrows, based 
on a set of transition probabilities. The transition probabilities account for 
disease progression over time, which incorporates both the natural disease 
progression as well as any impact of the selected treatments on slowing 
disease progression. As a patient passes through each state, costs and quality 
of life values are accrued. 
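
A cohort version of such a Markov framework can be sketched as follows. This is a simplified illustration, not the model described in this case study: the three states, the transition probabilities, the costs, and the utilities in the example are hypothetical, and no discounting is applied. The three-month cycle (0.25 years) and the 25-year horizon (100 cycles) follow the text.

```python
def run_markov_cohort(transition, cycle_cost, utility, start, n_cycles,
                      cycle_years=0.25):
    """Propagate a cohort through a Markov chain, accruing cost and QALYs.

    transition[i][j] is the per-cycle probability of moving from state i
    to state j; each row must sum to 1. No discounting is applied.
    """
    n = len(transition)
    for row in transition:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the transition matrix must sum to 1")
    dist = list(start)
    total_cost = 0.0
    total_qalys = 0.0
    for _ in range(n_cycles):
        # Accrue cost and quality-adjusted time for the current occupancy.
        total_cost += sum(dist[i] * cycle_cost[i] for i in range(n))
        total_qalys += sum(dist[i] * utility[i] for i in range(n)) * cycle_years
        # Advance one cycle: new_dist[j] = sum_i dist[i] * P[i][j].
        dist = [sum(dist[i] * transition[i][j] for i in range(n))
                for j in range(n)]
    return total_cost, total_qalys, dist


if __name__ == "__main__":
    # Hypothetical three-state example: disease free, metastatic, dead.
    P = [[0.95, 0.04, 0.01],
         [0.00, 0.85, 0.15],
         [0.00, 0.00, 1.00]]
    cost, qalys, final = run_markov_cohort(
        P, cycle_cost=[500.0, 8000.0, 0.0], utility=[0.9, 0.5, 0.0],
        start=[1.0, 0.0, 0.0], n_cycles=100)  # 100 cycles = 25 years
    print(round(cost), round(qalys, 2), [round(x, 3) for x in final])
```

Comparing two therapies then amounts to running the function twice, once with each therapy's transition matrix, costs, and utilities, and taking differences of the outputs.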

Health states for the solid tumor cancers used current information from 
several sources, including: the American College of Surgeons (ACS) [13], 
the American Cancer Society [14], the National Cancer Institute (NCI) [15], 
and the National Comprehensive Cancer Network (NCCN) [16]. 

Figure 11.3 Schematic for Markov model of solid tumor cancers

Transitions between these health states are determined by the rate of
progression of the cancer and the therapy provided at each health state. To
capture the natural cancer disease progression rates, a cycle time of three
months was chosen. The natural, untreated, cancer progression between
health states is summarized in a probability matrix that quantifies the
likelihood that a patient will progress to another health state during a given
three-month period. The transition probabilities required by this model were
not directly available from the published literature. Therefore, it was
necessary to calculate the transition probabilities from available data on
mortality, morbidity, progression to metastatic disease, and disease-free
survival. The transition probabilities were heuristically developed by
iteratively solving the transition matrix to reproduce the mortality, progression
to metastatic disease, and disease-free survival from varying starting stages
of cancer. All information used to calculate the health state transition
probabilities came from DeVita’s Cancer Anthology [17].
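
The idea of fitting per-cycle probabilities to published outcome data can be sketched in Python. The example below is an assumption-laden simplification rather than the heuristic actually used: it considers a hypothetical chain with only local disease, metastatic disease, and death, and uses bisection to find the three-month progression probability that reproduces a target five-year survival. The first function shows the simpler constant-hazard conversion of a multi-year survival figure into a per-cycle probability.

```python
def cycle_probability(survival, years, cycle_years=0.25):
    """Per-cycle event probability implied by survival over `years`,
    assuming a constant hazard."""
    return 1.0 - survival ** (cycle_years / years)


def survival_after(p_progress, p_die_metastatic, n_cycles):
    """Fraction alive after n_cycles in a local -> metastatic -> dead chain."""
    local, metastatic = 1.0, 0.0
    for _ in range(n_cycles):
        local, metastatic = (
            local * (1.0 - p_progress),
            local * p_progress + metastatic * (1.0 - p_die_metastatic),
        )
    return local + metastatic


def calibrate_progression(target_survival, p_die_metastatic,
                          n_cycles=20, tol=1e-10):
    """Bisection on the progression probability (20 cycles = 5 years).

    Survival falls as the progression probability rises, so the search
    interval can be halved at every step.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if survival_after(mid, p_die_metastatic, n_cycles) > target_survival:
            lo = mid  # too many survivors: progression must be faster
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical target: 60% five-year survival, 10% death per cycle
    # once disease is metastatic.
    p = calibrate_progression(0.60, 0.10)
    print(round(p, 4), round(survival_after(p, 0.10, 20), 4))
```

With more states and more targets, the same principle applies, but several probabilities must be adjusted together until all of the published figures are matched.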

The impact of treatments was incorporated in one of two ways, depending
on the available data. When possible, typically when there were data from a
comparative clinical trial, the untreated cancer progression rates were
modified by the observed changes in relative risk of disease progression.
When this information was not available, the same heuristic approach used 
to create the base case transition matrix was used to recalculate the entire 
transition matrix from the mortality, morbidity, and disease progression 
observed in longitudinal studies of the specific treatment. 
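
The first approach, modifying untreated progression rates by a relative risk, can be sketched as follows. The function name, the choice of which transitions count as progression, and the example numbers are all hypothetical. The sketch scales the selected progression probabilities and assigns the freed probability to remaining in the current state, so that the row still sums to one.

```python
def apply_relative_risk(row, stay_index, progress_indices, relative_risk):
    """Return a treated transition row.

    row: untreated per-cycle transition probabilities from one state.
    progress_indices: positions in the row that represent progression.
    relative_risk: treated risk divided by untreated risk.
    """
    treated = list(row)
    for j in progress_indices:
        treated[j] = row[j] * relative_risk
    # Everything not leaving the state stays in it.
    leaving = sum(p for j, p in enumerate(treated) if j != stay_index)
    if leaving > 1.0:
        raise ValueError("scaled probabilities exceed 1")
    treated[stay_index] = 1.0 - leaving
    return treated


if __name__ == "__main__":
    # Hypothetical row: stay, progress to metastatic disease, die.
    untreated = [0.90, 0.08, 0.02]
    print(apply_relative_risk(untreated, stay_index=0,
                              progress_indices=[1], relative_risk=0.5))
```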

Health state utilities came from the published literature for each cancer, a 
good starting source being Teng and Wallace [18]. Since cancer treatments 
themselves affect a patient’s quality of life, the utilities were dependent both 
on the health state itself as well as the treatment selected. 

The main challenge with this model is finding sufficient baseline data. Each 
of the states shown in Figure 11.3, with the exception of “disease free”, 
“supportive care” and “watch and wait”, may have four classes of treatment 
options (surgery, chemotherapy, hormone, and radiation therapies) available. 
Since these classes of therapy may be given in combination with each other 
(e.g. [surgery and chemotherapy] or [chemotherapy and radiation therapy]), 
there are 14 combinations of treatment classes that may be provided. In this 
discussion, a combination of treatment classes is called a “treatment 
category”. Within each of these treatment classes there may be several 
different medications and/or procedures that could be used. Each potential 
treatment option within a treatment category is associated with a unique
probability of transitioning to the other states, costs, health state utilities, and 
likelihood of adverse events associated with therapy. As a rough estimate, 
there are seven states, with fourteen treatment categories per state, so even if 
there were only 5 treatment options per treatment category, there would be 
about 490 different treatments for which to find data. The most feasible 
approach given the immense amount of data required by the model and the 
scarcity of suitable data sources is to limit the number of options that the 
decision maker can compare. 

To determine the “best” (most commonly used and most relevant 
comparators for the novel medication) treatment options to present to the 
decision maker, a clinical oncologist reviewed practice patterns at his large 
oncology program. This clinical oncologist chose and verified the top three 
treatments, procedures, and/or diagnostics used in the management of the 
specific cancers for each health state. The clinical consultant also provided 
estimates of the percentage of use for each treatment in each health state. 
Default values for the percentage utilization rates for each treatment 
category were obtained using data from the American College of Surgeons 
National Cancer Database (NCDB) [13], which details current treatment 
methods for each type of cancer.

To calculate the expected cost by treatment category, the per patient costs
associated with each treatment option and the per patient likelihood of the 
option being used were estimated, and then used to calculate the expected 
three-month cost of each treatment category, per patient, by health state. The 
costs of adverse events are also included in the total costs for each treatment. 
For example, the baseline cost of a specific chemotherapy includes the three- 
month cost of the chemotherapy in addition to the cost of treating the 
adverse events associated with that chemotherapy over the three months. 
Three-month costs are calculated since this is the cycle time of the Markov
model. 
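
The expected-cost calculation described above is a weighted average. In the sketch below, each treatment option within a category is represented by its share of use, its three-month treatment cost, and the three-month cost of treating its adverse events. All names and numbers are hypothetical placeholders.

```python
def expected_category_cost(options):
    """Expected three-month cost per patient for one treatment category.

    options: list of (share_of_use, treatment_cost, adverse_event_cost);
    the shares of use must sum to 1.
    """
    total_share = sum(share for share, _, _ in options)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares of use must sum to 1")
    return sum(share * (treatment + adverse)
               for share, treatment, adverse in options)


if __name__ == "__main__":
    # Hypothetical chemotherapy category with three options.
    chemo = [(0.5, 6000.0, 1000.0),
             (0.3, 4000.0, 500.0),
             (0.2, 9000.0, 2000.0)]
    print(expected_category_cost(chemo))
```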

The types of costs that can be accrued in each health state are separated into 
seven categories. These are costs for: diagnostic tests and procedures; 
surgery; chemotherapy; radiotherapy; hormone therapies and procedures; 
other treatments; and new treatments. The category “other treatments” 
includes any cancer treatment that does not fall into the above categories; 
examples include biotechnology products (e.g., Herceptin (a monoclonal 
antibody)) and pain medications. The new treatment category is where the 
decision maker includes the new cancer treatment in development about 
which the comparisons are to be made. The new treatment can function 
either as a stand-alone treatment category or as a new part of an existing
treatment category. 

The model can then be run comparing different treatment options at different 
stages of cancer, most specifically, those containing the new treatment as 
compared to those that do not. 

11.4.3 Results 

The model outputs include the expected time patients spend in each health 
state, the costs and benefits accrued in each health state, and the total costs 
and total benefits in terms of survival and quality-adjusted life years. In 
addition, if the decision maker sets a monetary value for a life year or a 
quality-adjusted life year, the model calculates the net benefit of the 
treatment (total monetary value of the benefits minus the total costs). All 
results are based on a time horizon of 25 years, so that lifetime data is collected
for most, if not all, patients, depending on cancer progression rates. If two 
therapies are considered, the model provides the following comparative 
analyses: 

• Incremental total expected costs by health state; 

• Incremental total expected quality-adjusted life years by health 
state; 

• Incremental expected time spent in a given health state; 







• Incremental total expected survival time; and 

• Incremental (monetary) net benefit. 

In addition, the model user may use the model to measure other factors, such 
as time to progression and survival times. 
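
The net benefit calculation mentioned above can be written in a few lines. The sketch assumes that the decision maker supplies a monetary value per QALY. The function names and the example figures are hypothetical.

```python
def net_monetary_benefit(qalys, cost, value_per_qaly):
    """Monetary value of the health benefit minus the total cost."""
    return value_per_qaly * qalys - cost


def incremental_net_benefit(new, old, value_per_qaly):
    """new and old are (qalys, cost) pairs for the two therapies."""
    return (net_monetary_benefit(new[0], new[1], value_per_qaly)
            - net_monetary_benefit(old[0], old[1], value_per_qaly))


if __name__ == "__main__":
    # Hypothetical comparison at a value of 50,000 per QALY.
    print(incremental_net_benefit(new=(2.5, 90_000.0),
                                  old=(2.0, 60_000.0),
                                  value_per_qaly=50_000.0))
```

A positive incremental net benefit indicates that, at the chosen value per QALY, the additional health gain is worth more than the additional cost.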

This project demonstrated that useful information can be obtained very
early in a new medication’s development. Such information helps in deciding
on the “must have” features, in terms of safety and efficacy as well as cost,
that a new medication needs in order to be a valuable
addition to the current treatment options for cancer. Despite rough data, a
base model could be constructed that passed top-line medical scrutiny. The 
model provided information to the internal development team responsible 
for the development of the new cancer medications and could be used to 
gather decision makers' impressions about various new medications in early 
development. As the results of clinical trials for new drugs become 
available, the model can be updated to reflect the new information. The new 
costs and benefits can then be shown to decision makers to gauge their level 
of interest in the new drugs. This model is very versatile and can provide 
useful information to a variety of potential end-users. 

11.5 CONCLUSIONS AND AVENUES FOR FURTHER RESEARCH 

The combination of OR and health economics/health outcomes research has 
a great deal of potential for providing useful, practical information for 
decision makers. The challenge will be to bring the scientific rigor and 
standards of OR to these fields to ensure that the best possible models and 
analyses are provided when using common modeling methodologies, such as 
decision trees, Markov models, Monte Carlo simulations, and other 
mathematical simulations. 

It is important to remember that while models provide good estimates about 
the potential long-term impacts of medications, they are only estimates. As 
the results from long-term studies of new drugs become available, they 
should be used to update the models and form a basis for reevaluating 
medications and their role in fighting any given disease. The models are 
duct-tape for the decision maker: they are designed to make use of the best 
currently available information to help current decisions, thereby bridging 
the gap until better information becomes available.

SUMMARY

Illicit drugs create serious health problems whose management is 
complicated by illegality, poor data, and market dynamics. Quantitative 
analysis can and does play a key role in clarifying implications of strategic 
choices concerning collective response to these problems. This chapter 
summarizes key arguments and findings concerning the effectiveness of 
various prevention and treatment strategies, including supply control 
measures. Among them are that conventional prevention programs are not 
very effective in an absolute sense, but they are so cheap that they are cost- 
effective. Likewise, treatment programs can be cost-effective despite very 
high relapse rates, in part because periods of heavy use impose such 
enormous costs on society. Enforcement can play a key role in defusing the
positive feedback loop created by contagious spread of initiation during the 
early phase of new drug epidemics because of its unique ability among 
diverse drug control interventions to focus its impact on the present.

KEY WORDS

Drug control, Optimal control, Resource allocation, Epidemic control

12.1 INTRODUCTION

Illicit drug use is an important health problem. Some 600,000 emergency 
department episodes in the US every year are related to illicit drugs [1]. 
National mortality estimates are not available, but probably on the order of 
20,000 drug-induced deaths occur each year [2], with many more indirectly 
related to drug use. Some 5 million Americans are in need of drug 
treatment, but fewer than 40% receive it [3, 4]. Injection drug use is a 
leading cause of the spread of infectious diseases such as HIV/AIDS and 
Hepatitis C [5]. The social costs of illicit drug use approach those of alcohol
and tobacco [6-8]. No one has estimated how many quality-adjusted life 
years are lost due to illicit drug use, but the number is no doubt substantial, 
particularly since those who die from illicit drug use are younger than those 
who die from most other causes. 

Not surprisingly, there is an energetic debate concerning how best to control 
drug use and related consequences. Operations research and management 
science have made important contributions to this debate. However, drug 
policy is unlike other health policy domains in important ways. This chapter 
begins with a review of some important differences. The following sections 
then highlight key insights that quantitative models have generated 
concerning the relative effectiveness of different interventions, including 
how that effectiveness varies over the course of a drug epidemic. 

12.1.1 How is drug policy different? 

Drug policy differs from other health policy domains in a number of 
respects. First, we care as much about outcomes for other people as we do 
about outcomes for the person with the “condition”. Cancer generates health 
consequences for people other than the patient, such as stress and depression 
among family members. Nevertheless, the focus of cancer treatment and 
policy is clearly and appropriately on the people who have cancer. 

The consequences of drug use are more diffuse. Fear that addicts or addicts’ 
suppliers will hurt non-users is an important source of public concern about 
drug use. One can argue that such fears are exaggerated. However, drug use 
has other health consequences for non-users that are under-appreciated. For 
example, addiction of all kinds, including to illicit drugs, is an important 
contributor to child abuse and neglect. 

In this respect, drug policy is more like a public health problem than a 
medical problem, and the behavioral component invites comparisons to 
second-hand smoke and drunk driving accidents rather than malaria or 
cholera. However, drug use is very much a “contagious” phenomenon that 
can usefully be studied by epidemic models, as will be discussed below. So,
analyzing drug policy merges important strands from behavioral health and
the contagious disease aspects of public health. 

Another fundamental difference between drug policy and other health policy 
problems is that the underlying activity is illegal. This has myriad 
ramifications, ranging from making data collection difficult to the fact that 
law enforcement plays an important role in controlling the prevalence and 
consequences of this “health problem.” 

An important consequence of drugs’ illegality is the existence of black 
markets, which are the proximate source of many drug-related harms [9]. 
Such markets would not exist were it not for the drug use, and reducing drug 
use (e.g., through treatment) shrinks the markets. Thus, a systems analysis 
should consider market outcomes. The need to do so distinguishes drug 
policy from other health policy domains. There is no market for heart 
disease, and with some exceptions, such as so-called “nuisance bars” [10], 
the markets for tobacco and alcohol are not themselves a major problem. 

A subtle consequence of the illegality of drug use is that it encourages the 
lumping together of all types of use because they are all the same in the eyes 
of the law. Not only does this blur distinctions between substances with 
very different health risks (the Drug Enforcement Administration places 
both marijuana and heroin in its most restricted - “Schedule I” - category of 
drugs), but it also blurs distinctions between dependent and non-dependent 
use. 

Drug dependence is a well-defined medical condition that can be diagnosed 
and treated. Recreational use by non-dependent persons is not a well- 
defined condition, and more often than not it does not lead to dependent use. 
Thus the vast majority of people who use an illicit drug never have a drug- 
related medical condition (even though they help support a drug market that 
generates adverse health outcomes for others). 

To complicate matters further, many of those with this medical condition 
deny they have it and/or are ambivalent about getting rid of it. This attitude 
contributes to very low compliance with treatment. It has become common 
to point out that compliance rates (e.g., rates of testing negative for drugs) 
are not so different from rates of compliance with medical regimens for 
conditions such as hypertension or diabetes (e.g., admonitions to alter one’s 
diet). However, few diabetics want to have diabetes, whereas quite a few 
drug users are not sure they want to stop using drugs. Furthermore, many 
dependent users do not have a health insurance company or personal 
physician vested in addressing their dependence.

More such differences could be noted, but these suffice to make the basic
point that drug policy is a part of health care policy that necessarily must 
draw its “system boundary” quite broadly. One could examine just “drug 
treatment policy,” focusing on issues such as queue management and 
matching treatment modalities to patients (e.g., [11]), but that is a different 
topic. 

12.1.2 Scope and methods of analysis 

This chapter focuses on insights from quantitative analysis of “strategic” 
drug policy choices. At the highest level, this paradigm views drug policy as 
a resource allocation problem. Some governmental entity decides how many 
resources to allocate to drug control and how to divide those resources 
across broad programmatic areas in order to achieve the greatest impact. 
Such analysis is helpful because a variety of drug control strategies exist, 
and the drug “system” is complex, so it is not intuitively obvious what the 
best combination of strategies is. 

Analyses of this sort began to appear in the 1970s in response to the heroin 
epidemic (e.g., [12-14]), and became more common after the spread of
cocaine. Early contributors to this second wave included groups at the 
RAND Corporation’s Drug Policy Research Center [15-24], UCLA (e.g., 
[25-27]), Carnegie Mellon University’s Heinz School [28-33], and later the 
Technical University of Vienna (e.g., [34-39]) as well as individuals 
elsewhere (e.g., [40, 41]), with growing communities of analysts elsewhere 
in Europe (e.g., [42-44]) and notably in Australia [45-48]. There is also an 
extensive and fascinating literature on modeling HIV/AIDS (e.g., [49, 50]) 
that intersects with injection drug use, but for reasons of space these issues 
are dealt with very briefly here. 

The method employed in this chapter is simply to skim from this literature 
insights that can be communicated effectively without detailed technical 
exposition of the underlying models and analysis, with some bias toward 
results that the author has observed to be compelling to policy makers and 
non-academics. 

The methods employed in the underlying literature are diverse, but mainly 
involve construction of some nonlinear descriptive model of the behavior of 
drug users and sometimes sellers, with inputs corresponding to various 
policy alternatives. Depending on the sophistication of the model and 
associated analysis, the models are then used to reproduce past and present 
behavior and/or to make recommendations for the future, either in “what if” 
policy simulation mode or through some formal optimization. 

Collectively, the greatest weakness of the literature is the inability to truly 
validate these models given the paucity of reliable data and the inherently 
small sample size when the unit of analysis is at the national level. A 
consequence is that specific numerical results are not very precise; the 
models are more reliable for general structural insights of the sort offered 
below. The models' greatest contribution stems from precision of a different 
sort: the precision and rigor that come from translating less quantitative 
scholars’ mental models into equations, from which powerful insights often 
emerge through relatively simple analysis. 

12.2 RESULTS 

A number of insights emerge directly from models of use, without explicit 
consideration of specific control measures. We begin with a few such 
insights before considering results from models of prevention, treatment, 
enforcement, and drug epidemics. 

12.2.1 Models of use 

Everingham and Rydell [23] made a pioneering contribution to 
understanding of drug policy by developing a simple two-state Markov 
model of cocaine demand that distinguishes between so-called “light” and 
“heavy” users. Figure 12.1 illustrates a modified version of the model with 
flow rates recently updated by Knoll and Zuba [51]. 



Figure 12.1 Everingham and Rydell’s light and heavy user model 
[23] with flow rates updated by Knoll and Zuba [51] 




Several insights emerge from this simple model. For example, most 
(roughly 5/6) of those who try cocaine do not escalate to heavy use, but 
those who do persist in the heavy use state for many years (1/g = 18 years). 
Per capita consumption rates for heavy users are much higher than for light 
users.¹ As a result, expected lifetime consumption per initiation is on the 
order of 225 - 475 grams [52] - but that figure is dominated by a very large 
expected consumption given escalation to heavy use multiplied by a small 
probability of escalating. Given the high social cost per gram consumed,² 
this implies that the expected social cost per cocaine use career is very large 
(on the order of $50,000 - $100,000 per initiate). The median cost is at least 
an order of magnitude lower, if not two. Indeed, given that many users 
appear never to proceed beyond the “very light” stage [53], the median cost 
per cocaine use career could be close to 0.³ 
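
The skew between the expected and the typical career can be illustrated with a short calculation. The sketch below is not Everingham and Rydell's calibrated model. Its annual flow rates are assumptions chosen only to match figures quoted in this chapter: a heavy-use exit rate g = 0.055, light users exiting five times as fast, one initiate in six escalating, heavy users consuming 120 grams per year and seven times as much as light users, and a social cost of $215 per gram. It ignores discounting and any flow from heavy use back to light use.

```python
# Stylized light/heavy career arithmetic. Every rate below is an assumption
# chosen to match figures quoted in the text, not the calibrated flow rates
# of Figure 12.1.

HEAVY_EXIT = 0.055             # g: annual exit rate from heavy use (1/g is about 18 years)
LIGHT_EXIT = 5 * HEAVY_EXIT    # assumed: light users quit five times as fast
ESCALATION = LIGHT_EXIT / 5    # assumed: gives a 1-in-6 chance of escalating
HEAVY_GRAMS_PER_YEAR = 120.0   # heavy-user consumption rate quoted in the chapter
LIGHT_GRAMS_PER_YEAR = HEAVY_GRAMS_PER_YEAR / 7   # assumed 7:1 heavy-to-light ratio
SOCIAL_COST_PER_GRAM = 215.0   # dollars per gram, the figure attributed to [54]


def career_summary():
    """Return undiscounted career quantities for one new initiate."""
    p_escalate = ESCALATION / (ESCALATION + LIGHT_EXIT)
    years_light = 1.0 / (ESCALATION + LIGHT_EXIT)   # mean time spent as a light user
    years_heavy = 1.0 / HEAVY_EXIT                  # mean time spent as a heavy user
    light_grams = LIGHT_GRAMS_PER_YEAR * years_light
    heavy_grams = HEAVY_GRAMS_PER_YEAR * years_heavy
    expected_grams = light_grams + p_escalate * heavy_grams
    return {
        "p_escalate": p_escalate,
        "light_only_grams": light_grams,
        "expected_grams": expected_grams,
        "light_only_cost": light_grams * SOCIAL_COST_PER_GRAM,
        "expected_cost": expected_grams * SOCIAL_COST_PER_GRAM,
    }


if __name__ == "__main__":
    for name, value in career_summary().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions most of the expected consumption comes from the minority of careers that reach heavy use, which is why the expected cost per initiate sits far above the cost of a career that never escalates.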

Highly skewed consumption and social cost distributions are not unique to 
cocaine. The average marijuana use career involves 375 - 875 grams of 
consumption, or 1,000 - 2,000 joints, whereas median lifetime days of 
marijuana use are less than 100 [54]. For heroin in the US, per capita 
consumption rates are an order of magnitude lower than for cocaine but the 
social costs per gram are an order of magnitude higher [54] and exit rates 
from dependent use no higher [55]. Thus, the expected social cost for 
someone who escalates to dependent heroin use is at least as high as is the 
corresponding figure for cocaine. Indeed, inasmuch as light use of heroin is 
more likely to involve smoking and heavy use to involve injecting, the skew 
in social cost could be greater than the skew in social cost for cocaine.⁴ 

The sharply different exit rates for light and heavy users (factor of 5 
difference in Figure 12.1) and the lag between initiation and escalation to 
heavy use mean that the character of drug use can vary sharply over the 
course of a drug epidemic, as illustrated in Figure 12.2 [51]. Demand for 
cocaine⁵ rose sharply with cocaine initiation in the mid- to late 1970s, but 
did not fall when initiation fell in the 1980s. Rather, sharp declines in the 
number of light users were offset by numerically smaller increases in the 



1 Everingham and Rydell suggested a ratio of 7.25 to 1. More recently, Abt analysts 
estimated that “chronic” users spend about 6 times as much per capita as do 
“occasional” users [94]. 

2 Rydell and Everingham [24] originally estimated average social costs of about 
$100 per gram, but Caulkins et al. [54] use newer evidence to develop a figure of 
$215 per gram. 

3 Kaya et al.’s [48] analysis of Australian heroin use data shows that the number of 
people quitting heroin use over time is highly correlated with the number initiating, 
presumably because the modal career of use is very short. 

4 A tendency for light users to smoke and heavy users to inject heroin would reduce 
the skew in terms of grams of heroin consumed because injection is the more 
“efficient” route of administration, although dependent users are also more likely to 
have developed tolerance and to take larger effective doses. 

5 Demand is proxied by the sum of light and heavy users weighted by their relative 
propensities to consume. 
number of heavy users, whose per capita consumption rates are much higher, 
leaving overall demand substantially stable for close to two decades. The 
proportion of that demand attributable to light vs. heavy users changed 
dramatically, however. In 1980 much of the demand came from light users 
who are not badly affected by their drug use. By 1990 most of the demand 
came from heavy users, many of whose lives are dominated by the drug. 
This picture follows fairly directly from the simple Markov model of use, 
yet it is so compelling that the Office of National Drug Control Policy 
incorporated [23] an earlier version of it in several of its National Drug 
Strategy Reports (e.g., [56]). 
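
The shift in who accounts for demand can be reproduced qualitatively with a few lines of simulation. The following is a minimal sketch, not the model behind Figure 12.2: the flow rates are illustrative assumptions consistent with the ratios quoted in the text, the initiation path is hypothetical (ten years of high initiation followed by a drop to one-fifth of that level), and user counts are in arbitrary units.

```python
def simulate(initiation, light_exit=0.275, escalation=0.055,
             heavy_exit=0.055, heavy_weight=7.0):
    """Step a light/heavy user model forward one year per initiation entry.

    Returns a list of (light, heavy, demand) tuples, where entry t is the
    state at the start of year t and demand = light + heavy_weight * heavy.
    All default rates are illustrative assumptions.
    """
    light, heavy = 0.0, 0.0
    history = [(light, heavy, light + heavy_weight * heavy)]
    for inflow in initiation:
        # Both updates use the start-of-year values of light and heavy.
        light, heavy = (
            light + inflow - (light_exit + escalation) * light,
            heavy + escalation * light - heavy_exit * heavy,
        )
        history.append((light, heavy, light + heavy_weight * heavy))
    return history


def heavy_share(state, heavy_weight=7.0):
    """Fraction of weighted demand attributable to heavy users."""
    light, heavy, demand = state
    return 0.0 if demand == 0 else heavy_weight * heavy / demand


# Hypothetical initiation path: ten years of high initiation, then a drop.
INITIATION = [1.0] * 10 + [0.2] * 10
HISTORY = simulate(INITIATION)

if __name__ == "__main__":
    for year in (10, 20):
        light, heavy, demand = HISTORY[year]
        print(year, round(light, 2), round(heavy, 2),
              round(demand, 2), round(heavy_share(HISTORY[year]), 2))
```

In this stylized run the number of light users falls sharply once initiation drops, while the number of heavy users keeps rising for several years, so the heavy-user share of weighted demand grows even though total demand changes far less than initiation does.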

Figure 12.2 Evolution of cocaine demand in the US, expressed in 
millions of light users or their equivalent, assuming heavy users 
consume seven times as much per capita as light users 




These figures also provide a convenient way of thinking about various drug 
control interventions. Primary prevention programs seek to reduce 
initiation. Treatment seeks to increase quitting from heavy use. 

Enforcement programs seek to do both and also reduce per capita 
consumption by current users by deterring users and constraining supply. 
Epidemic models analyze how the relative effectiveness of these 
interventions varies over the course of drug epidemics such as the one 
depicted in Figure 12.2. We turn next to a discussion of key insights relating 
to these programs. 

12.2.2 Reducing demand through prevention 

There has been great confidence that drug prevention is effective and cost 
effective. For example, the 1999 national drug strategy [57] stated 
unequivocally that, “The simplest and most cost-effective way to lower the 
human and societal costs of drug abuse is to prevent it in the first place.” 

However, there is enormous heterogeneity in programs, ranging from 
adventure camps to mass media campaigns. Some are more effective than 
others [58]. Experimental trials have shown some school-based programs to 
decrease illicit drug use [59-61], yet the most popular school-based program, 
the Drug Abuse Resistance Education or DARE program, has not been 
shown to have any material effect on marijuana use [62]. 

Furthermore, the experimental evidence pertains only to self-reported use of 
indicator substances (e.g., marijuana), through final followup data collection 
(typically 9th or 12th grade) by people in the program. However, from a 
policy perspective one is interested in the impact on actual (not self- 
reported) use of the more damaging illicit drugs (e.g., cocaine) over the 
lifetime of all people affected, including those not in the program. 

Caulkins et al. [52, 54] developed mathematical models for projecting total 
impact of school-based prevention programs based on available evidence 
concerning “best-practice” programs. There is considerable uncertainty 
concerning the projections, but the bottom-line finding is that these 
programs are cost effective, though not very effective. 

Drug prevention is not very effective if one compares it to conventional 
childhood vaccination. If one gives the very best prevention program to a 
group of youths who would have used drugs, most will go ahead and use 
drugs anyhow. Even cutting-edge school-based programs only reduce 
marijuana use by 5-15%, and for almost all programs those effects decay by 
the end of high school. Even recognizing that delayed initiation is 
associated with lower lifetime use, this translates into reductions in the 
present value of lifetime consumption in the single digits, and most likely 
just a few percent, because reductions in lifetime use are only one-fifth to 
one-third as great as the reductions observed immediately following program 
completion [54]. 

Thus prevention cannot be “the” solution to the drug problem. Indeed, the 
notion that enforcement merely needs to “hold the line” until prevention can 
“cut the legs out from under the epidemic” does not seem realistic given that 
the problem is now more endemic than epidemic. It is similarly unrealistic 
to hope, as some drug legalization advocates suggest, that funding drug 
prevention with the money saved by not having to enforce the prohibition 
would offset any legalization-induced increase in use. 

On the other hand, prevention is cheap, even if one recognizes that the 
dominant societal cost of running school-based prevention programs is the 
opportunity cost of not using class time to teach other academic subjects. 
The outright budgetary costs for programs delivered by regular teachers who 
are already on the payroll are tiny. Since preventing drug use is so valuable 
and prevention so inexpensive, prevention is cost effective even though it is 
not very effective. 

One “paradox” of prevention that an OR/MS analysis reveals is that only 
about one-quarter of a prevention program’s impact on cocaine use comes 
from preventing program participants from initiating use [52]. Some impact 
comes from reduced use by program participants who do initiate and use at 
some level. Still only about one-third of the reduction in consumption is in 
the form of reduced consumption by program participants. Two-fifths 
comes from positive spillover to friends and associates of those in the 
program, and one-fifth comes about because reduced use by all these people 
shrinks the market, making enforcement against those who remain in the 
market more effective. Thus, conventional evaluations of prevention that 
focus on abstinence for program participants miss some two-thirds to three- 
quarters of the effects that would be assessed by a systems analysis. 

Another interesting insight is that “school-based prevention should be done 
15 years before one knows we need to do prevention” [52]. The average age 
of initiation of “hard” drugs is about eight years after the age targeted by 
school-based prevention programs. National recognition of a drug epidemic 
may occur five or more years after the peak in initiation. (US cocaine 
initiation peaked in 1979; it was recognized as a national crisis around 
1984.) Since it takes time to appropriate funds, adapt and scale up 
prevention programs, and so forth, this implies that school-based prevention 
must be started about 15 years before it is widely appreciated that prevention 
is needed! Given the contagious nature of drug epidemics, prevention 
programs implemented before the beginning of an epidemic are likely to be 
many times more effective than programs implemented after the epidemic 
has matured.⁶ Since ability to forecast drug trends is exceedingly limited, 
the practical implication is that prevention programs should be funded on an 



6 Precise statements are difficult because effectiveness ratios are sensitive to the 
number of heavy users at the beginning of the epidemic, a parameter for which data 
are particularly weak. See Winkler et al. [110] for more on how the relative 
effectiveness of different types of drug prevention varies over the course of an 
epidemic. 






ongoing basis, not in response to current crises. Decisions about prevention 
should not be made only with an eye toward ameliorating the current 
epidemic. Instead, prevention should be seen as “lending a hand” in reducing 
the current drug epidemic and possibly other undesirable social trends, while 
also serving as a form of inexpensive insurance against possible future 
epidemics. 

Likewise, drug prevention should be - and usually is - generic, not drug- 
specific. Indeed, less than half of the social benefits of school-based drug 
prevention stem from reduced use of illicit drugs. The majority stems from 
reductions in smoking and heavy drinking [54]. 

12.2.3 Reducing demand through treatment 

Treatment is the most thoroughly evaluated drug control intervention. 
Indeed, the literature is so large there is even a bibliography just of other 
literature reviews of drug abuse treatment effectiveness [63]. Most 
observers conclude that drug abuse treatment is cost effective (e.g., [24, 64]). 
The Institute of Medicine [65] summarized the literature by saying, 
“Research has shown that drug abuse treatment is both effective and cost 
effective in reducing not only drug consumption but also the associated 
health and social consequences.” On the other hand, a National Research 
Council Report [66] subsequently attacked the existing data on treatment as 
vulnerable to various methodological biases, concluding that, “There is little 
firm basis for estimating the benefit-cost ratio or relative cost effectiveness 
of treatment.” The principal complaint was that few true randomized 
controlled trials had been conducted. 

What is clear to a systems analyst, though not necessarily a social scientist, 
is that decision-relevant insight can be gleaned even if it is not possible to 
produce a bottom line benefit-cost ratio. For example, one can work 
backwards to ask how effective treatment must be to be cost-justified. If the 
resulting breakeven effectiveness seems implausibly high, one would be 
skeptical that treatment is a good investment. If it seems attainable, one 
might be more optimistic. 

Rydell and Everingham [24] in fact performed such exercises. One of their 
striking findings was that even if every treatment client relapsed 
immediately after completing treatment, treatment could still be cost 
effective! The full model is too involved to explain here. It tracks cocaine 
as it is produced and passed through multiple distribution layers, and 
explicitly models user flows, prices, and market dynamics over a 15-year 
planning horizon. 

Back-of-the-envelope calculations are sufficient to convey the same basic 
insight, as we now demonstrate. In terms of the drug use model above, 
treatment can be thought of as doing three things: it can suppress use while 
the person is in treatment, it can reduce use between exit from treatment and 
relapse, and occasionally it may encourage some users to quit permanently. 
Rydell and Everingham’s startling insight is that the first mechanism alone 
can be enough to make treatment a good investment. 

Rydell and Everingham [24] estimated that the average admission to 
treatment costs about $2,000, the average time in treatment is 3 months, and 
use is suppressed by about 80% during treatment. If heavy users consume at 
a rate of 120 grams per year, the average admission averts about 120 * 
(3/12) * 0.8 = 24 grams of consumption through this “incapacitation” effect 
alone. Harwood et al.’s [8] cost of illness study estimated that the total 
social cost of illicit drugs in the US in 1992 (excluding impaired 
productivity) was $83.5 billion. Apportioning this by substance, dividing by 
Rydell and Everingham’s [24] estimate of 291 metric tons of cocaine 
consumption in 1992, and adjusting for inflation, Caulkins et al. [54] roughly 
estimate an average social cost of $215 per gram of cocaine consumed in the 
US. So Rydell and Everingham’s implied social benefit per treatment 
admission (24 grams * $215/gram = ~$5,000) exceeds the roughly $2,000 
cost.⁷ Indeed, the benefit-cost ratio would be greater than one even if every 
user relapsed immediately after leaving treatment and treatment only 
suppressed use by one-third during treatment (120 * (3/12) * (1/3) * $215 > 
$2,000). 
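
The arithmetic of this “incapacitation” effect is easy to reproduce. The sketch below simply restates the numbers quoted above (120 grams per year, 3 months in treatment, 80% suppression, $215 per gram, $2,000 per admission). The function and variable names are introduced here for illustration only.

```python
TREATMENT_COST = 2000.0   # dollars per admission, as quoted from [24]


def incapacitation_benefit(grams_per_year=120.0, months_in_treatment=3.0,
                           suppression=0.8, cost_per_gram=215.0):
    """Social benefit of reduced use during treatment only.

    Assumes every client relapses immediately on leaving treatment.
    """
    grams_averted = grams_per_year * (months_in_treatment / 12.0) * suppression
    return grams_averted * cost_per_gram


def breakeven_suppression(grams_per_year=120.0, months_in_treatment=3.0,
                          cost_per_gram=215.0, treatment_cost=TREATMENT_COST):
    """Smallest in-treatment suppression at which benefit equals cost."""
    full_benefit = grams_per_year * (months_in_treatment / 12.0) * cost_per_gram
    return treatment_cost / full_benefit


if __name__ == "__main__":
    benefit = incapacitation_benefit()
    print(f"benefit per admission: ${benefit:,.0f}")
    print(f"benefit-cost ratio: {benefit / TREATMENT_COST:.2f}")
    print(f"breakeven suppression: {breakeven_suppression():.1%}")
```

The breakeven suppression comes out slightly below one-third, which is the point made in the text: treatment can pay for itself through reduced use during treatment alone.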

One can do a similar breakeven calculation with respect to treatment’s 
impact on exit rates. Suppose the present value of the residual career length 
of the average treatment entrant is 8 years. (In Figure 12.1’s Markov model 
the undiscounted residual career length would be 1/g = 1/0.055 = 18 years, 
but one should discount back to the present and truncate to recognize that 
people - especially chronic drug users - do not live forever.) If the social 
cost per year of use is approximately 120 grams/year times $215/gram = 
$25,000 per year, then the discounted social value of averting a present value 
of 8 years of such use by getting a heavy user to quit is about $200,000. 
Hence, if even 1% of treatment admissions led to permanent cessation, the 
present value of treatment’s benefits would equal its costs. 



7 Of course the social cost per gram of consumption averted by treatment could in 
theory be below average cost, but more likely it is higher. The biggest danger from 
light use of cocaine is the possibility of escalation to dependent use, and since many 
of those in treatment are “referred” by the criminal justice system, consequences of 
their use may be costly even relative to those of other heavy users. 








Similarly, if one in 12 people entering treatment ceases use for a year (and 
no one quits permanently and no one else reduces use during treatment) the 
benefits would exceed the costs. Any linear combination of these three 
effects would also lead to a breakeven benefit-cost ratio. For example if one 
client in 20 did not relapse for a year and one in 250 quit, treatment’s 
benefits would exceed its costs even if the treatment had no impact 
whatsoever on 95% of clients. 
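
These breakeven combinations can be checked with a small calculation. The sketch below is an illustration under the assumptions stated in the text: a year of heavy use costs society 120 grams times $215 per gram, a permanent quit averts a present value of 8 such years, and an admission costs $2,000. It uses the unrounded $25,800 per year rather than the text's rounded $25,000, and the function names are hypothetical.

```python
GRAMS_PER_YEAR = 120.0
COST_PER_GRAM = 215.0
ANNUAL_SOCIAL_COST = GRAMS_PER_YEAR * COST_PER_GRAM   # about $25,800 per year
PV_RESIDUAL_YEARS = 8.0     # assumed present value of residual career length
TREATMENT_COST = 2000.0     # dollars per admission


def benefit_per_admission(p_quit=0.0, p_abstain_one_year=0.0):
    """Expected social benefit per admission from the two exit effects.

    p_quit: probability an admission leads to permanent cessation.
    p_abstain_one_year: probability a client abstains for one year and
    then relapses. In-treatment suppression is deliberately left out.
    """
    quit_benefit = p_quit * PV_RESIDUAL_YEARS * ANNUAL_SOCIAL_COST
    abstain_benefit = p_abstain_one_year * ANNUAL_SOCIAL_COST
    return quit_benefit + abstain_benefit


def breaks_even(**effects):
    """True if the expected benefit covers the cost of an admission."""
    return benefit_per_admission(**effects) >= TREATMENT_COST


if __name__ == "__main__":
    print(breaks_even(p_quit=0.01))
    print(breaks_even(p_abstain_one_year=1 / 12))
    print(breaks_even(p_quit=1 / 250, p_abstain_one_year=1 / 20))
```

Because the benefit is linear in the two probabilities, any mix of effects lying on or above the breakeven line covers the cost of treatment.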

Pollack [67] has taken this insight a step further, noting that methadone 
maintenance (a treatment for heroin users, who often use by injection) can 
have benefits that exceed its costs even if it gets no credit at all for reducing 
drug use - simply because it can reduce the rate at which users spread HIV 
by sharing syringes. 

More generally, interventions can reduce drug-related harm and have 
positive social benefit-cost ratios even if they do not reduce drug use. 
Indeed, treatment is sometimes described as a “hook” for getting needy 
people in contact with health and social service agencies. Such a “harm 
reduction” approach to drug control is common outside the US [68], 
although it has not been the subject of much formal systems analysis. 

As Manski et al. [69] argue, in the absence of rigorous randomized 
controlled trials it is not possible to conclude with certainty that treatment is 
cost effective, but what is clear from Rydell and Everingham and others’ 
work is that the breakeven effectiveness values are not very high and that 
relapse rates are not an adequate metric for evaluating the value of treatment. 
Hence, Manski et al.’s complaint that, “When complete and permanent 
abstinence is used as a criterion of success, between 60 and 90 percent of 
clients relapse to drug use within 12 months of treatment,” [69] does not 
seem altogether damning. 

The work of Rydell and Everingham also provides a cautionary note. If 
most people relapse, then unless those individuals can be re-enrolled rapidly, 
there is a limit to how quickly treatment can ameliorate the drug problem. In 
Rydell and Everingham’s model (which assumed that 13.2% of treatment 
entrants left heavy use because of that treatment, with two-thirds merely de- 
escalating to light use), even if every heavy cocaine user received treatment 
once a year, cocaine use would still only be cut in half over 15 years. 
Furthermore, Rydell and Everingham did not consider the possibility that 
such an expansion in treatment might have an adverse feedback effect on 
initiation, as do Behrens et al. [70, 71]; such an effect would make programs 
less effective. Highly imperfect treatment programs, no matter how cost 
effective, cannot quickly eliminate an endemic drug problem. Everingham 
and Rydell [24] and Caulkins et al. [52] make similar points concerning 
prevention. 

12.2.4 Reducing supply 

Interventions can affect supply in two ways. Unanticipated interventions 
can disrupt the market equilibrium. Ideally the disruption takes the form of 
physical shortage, and the market does not regenerate, but that is not the 
norm. Usually suppliers adapt, although prices may spike and use decline in 
the interim [72]. At one time or another over the last 30 years, four different 
regions have been the principal supplier of heroin to the US (Mexico, South 
America, Southwest Asia, and Southeast Asia). Similarly, Colombia quickly 
replaced Mexico as the principal supplier of marijuana to the US in response 
to paraquat spraying and fears of adverse health effects of using sprayed 
marijuana [73]. 

Enforcement can also affect supply even if the intervention is fully 
anticipated. For example, if smugglers knew that one-quarter of all 
shipments would be seized, they would ship more than if they thought none 
would be seized. Indeed one of the early lessons that drug policy analysis 
gave policy makers was that quantity seized is not a direct measure of 
enforcement’s impact on consumption [72]. However, presumably 
smugglers would charge more per kilogram landed to make up for their 
losses. The higher prices represent a shift in supply that affects retail prices 
and, hence, consumption. 

At one time demand was thought to be insensitive to price, but the price 
elasticity of demand for illicit drugs turns out to be rather high, much higher 
than for cigarettes. (For a review of the literature, see Chaloupka and Pacula 
[74]). Nor does this price-responsiveness seem to be confined to light use 
reported in surveys. Crane et al. [75] estimate that the elasticity of cocaine 
emergency department mentions with respect to price is -0.63, and Caulkins 
[76] notes that a simple constant elasticity model predicts emergency 
department mentions for both cocaine (elasticity -1.3) and heroin (elasticity 
-0.8). 
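
To give a feel for what elasticities of this size imply, the sketch below applies a constant elasticity relationship, in which the outcome is proportional to price raised to the power of the elasticity. The 50% price increase is a hypothetical input chosen for illustration, and the function name is not taken from any of the cited studies.

```python
def relative_change(price_ratio, elasticity):
    """Proportional change in an outcome under a constant elasticity model.

    price_ratio: new price divided by old price (1.5 means a 50% increase).
    elasticity: constant elasticity of the outcome with respect to price.
    """
    return price_ratio ** elasticity - 1.0


# Elasticities of emergency department mentions quoted in the text.
ELASTICITIES = {"cocaine": -1.3, "heroin": -0.8}

if __name__ == "__main__":
    for drug, eta in ELASTICITIES.items():
        change = relative_change(1.5, eta)   # hypothetical 50% price increase
        print(f"{drug}: {change:.1%} change in mentions")
```

With these values a 50% price increase would be associated with a drop in mentions of roughly two-fifths for cocaine and more than a quarter for heroin, which is far from the inelastic response once assumed.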

These disequilibrium and equilibrium aspects of enforcement’s effect on 
supply are quite distinct, and great confusion can arise if one tries to 
compare analyses or conclusions concerning one with those concerning 
another. Supply-side interventions are most likely to have disequilibrium 
effects if they quickly affect a large proportion of supply. For most drugs, 
the industry within US borders is populated by many vertically 
disaggregated “firms,” so it is difficult for enforcement to remove a large 
proportion of the national domestic distribution network’s capacity at any 
one time [77]. Furthermore, the network is robust because of its many 
lateral linkages, independent paths, and ability to expand quickly the 
capacity of individual arcs [78]. 

Interventions in source countries can have greater potential for market 
disruption because there is greater market concentration there. Perhaps the 
greatest success occurred when the Turkish opium ban, the breaking of the 
“French Connection” case, and Mexican opium eradication substantially 
drove up purity-adjusted heroin prices during the mid- to late 1970s, before 
Asian heroin filled the gap [79]. The greatest success in disrupting the 
cocaine supply was the result of a combination of US efforts and the “war” 
between the Colombian government and the Medellin-based traffickers in 
1989 which led to a sharp (50-100% at its peak) but short-lived (about 18 
months) increase in cocaine prices [80]. In 1995, Peruvian interdiction of the 
“air bridge” to Colombia led to a smaller but identifiable increase in cocaine 
prices [75]. 

There is reason to believe that transient price increases can have meaningful 
effects. The heroin scarcity in the 1970s coincided with the ebbing of the 
heroin epidemic [81]. Emergency room and medical examiner mentions 
declined in parallel with higher cocaine prices in 1989-1990 [82], and there 
was a one-period (three month) decline in emergency mentions in late 1995 
[83]. Some, however, argue that market disruptions can increase harms 
through unsafe use (e.g., more needle sharing) and greater market violence 
[84, 85]. 

There have been only a few analyses of the consequences of short-term 
disruptions (e.g., [75, 86]) and no serious estimates of the cost of generating 
disruptions. Hence, few real cost-effectiveness insights exist. This is clearly 
an area worthy of further research. 

There have been far more studies of how enforcement might affect the long- 
run market equilibrium. Such analyses use so-called “risks and prices” 
calculations of the sort pioneered by Reuter and Kleiman [87]. The “risks 
and prices” paradigm recognizes that increasing enforcement risks for 
dealers raises their cost of doing business. Dealers could simply absorb 
those costs, but presumably prefer to pass them along to users in the form of 
higher retail prices, which in turn reduce consumption [88]. 

The literature on risks and prices calculations generates a number of insights. 
For example, when efficiency is defined as kilograms seized per million 
taxpayer dollars spent, enforcement is more efficient at seizing drugs in 
source countries and while drugs are being smuggled into the US than within 
the US. However, suppliers are also more “efficient” at replacing drugs that 
are seized before they enter the US because the drugs are so much less 
expensive in the source and transshipment countries. Unfortunately, when 
moving upstream in the supply chain, the “efficiency” gain for the suppliers 
trumps the efficiency gain for the interdictors. Hence, the effective cost to 
suppliers of replacing the drugs seized per million taxpayer dollars spent is 
lower, not higher, outside the US. Replacing seized drugs is just one of 
many components of the “tax” that enforcement imposes on equilibrium 
operations. For example, Rydell and Everingham’s [24] model also 
considers seizure of assets, arrest, imprisonment, incarceration of sellers who 
are also users, and indirect effects on production costs. However, even when 
considering all components of the tax, the same basic pattern persists. The 
cost imposed on suppliers per million taxpayer dollars spent on enforcement 
is lower outside the US than it is within the US. 

Hence, the only way international operations can be a more cost-effective 
“tax” on suppliers is if the tax is “multiplied” as the drugs move down the 
distribution chain. Boyum [89] and Caulkins [90] suggest reasons why there 
might be such multiplicative price transmission. Caulkins [80] finds some 
evidence for this proposition, but DeSimone [91] suggests variation by drug. 
It may be easier to create transitory disruptions through international 
operations, but unless a multiplicative price transmission model holds, it is 
harder for such enforcement to drive up equilibrium prices [79]. 

Within US borders, the risks and prices model has something of the feel of 
an arms race. If enforcement can impose enough cost on the suppliers per 
taxpayer dollar spent, it could be cost effective. Most analyses find that it is 
costly to fight this arms race in a mature market, as an excerpt from a simple 
static portion of Caulkins et al.’s [92] model suggests. 

Assume that the demand curve can be locally linearized with a known 
elasticity η, that the market is in equilibrium in the sense that suppliers’ 
revenues just cover costs, including normal profits, and that the industry 
supply curve stems from the following cost structure. “Normal” business 
costs per unit increase linearly in volume (i.e., they follow a textbook 
upwardly sloping linear supply curve), but there are two additional costs: (1) 
costs imposed by enforcement, including compensation for the risks of arrest 
and imprisonment, and (2) costs that are linear in the dollar value of the 
drugs distributed, not their weight. The last term is important because drug 
distribution is almost pure brokerage activity, requiring minimal processing, 
and the drugs weigh next to nothing per unit value. (Cocaine and heroin sell 
at retail for about ten and one hundred times their weight in gold, 
respectively.) Thus the suppliers’ costs of delivering drugs can be written as 



Total cost = (c₀ + c₁ Q) Q + E + γ (P Q), 
where P and Q are the market clearing price and quantity, E is the 
enforcement “tax”, and c₀, c₁, and γ are positive constants. With a little 
algebra [53] it is easy to show that shifts in demand and the enforcement tax 
have the following effects on the market equilibrium: 
where α₀ = c₀ / P₀, α₁ = c₁ Q₀ / P₀, and β = E₀ / (P₀ Q₀) are the fractions of 
dealers’ costs in the current equilibrium that are attributable to the linear part 
of the cost term above, the quadratic part of the cost term, and enforcement, 
respectively. Since γ is the remaining fraction, α₀ + α₁ + β + γ = 1, so one 
of these parameters (α₀) can be eliminated in the expressions above. 
Caulkins et al. [93] estimate that for the US cocaine industry in 1992, η = 
-1, α₀ = 0.55, α₁ = 0, γ = 0.25, and β = 0.2. Suppose these parameters still 
applied in 2000, when the Office of National Drug Control Policy [94] 
estimated that there were 3.035 million occasional and 2.707 million chronic cocaine 
users who collectively spent $35.3 billion while consuming 259 metric tons 
of cocaine. 
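
For readers who want to see where expressions of this kind can come from, the following sketch derives local equilibrium relations from the cost structure above. It assumes the zero-profit condition that revenue P Q equals total cost, treats E as a lump sum, and introduces the symbol δ for a small proportional shift in demand at a fixed price. These steps and the symbol δ are assumptions made for illustration, not a reproduction of the derivation in [53]. The resulting coefficients agree with those used in the numerical examples in this section.

```latex
% Zero-profit condition, divided through by Q:
(1-\gamma)\,P = c_0 + c_1 Q + \frac{E}{Q}

% Differentiate around (P_0, Q_0, E_0) and divide by P_0, using
% \alpha_1 = c_1 Q_0 / P_0 and \beta = E_0 / (P_0 Q_0):
(1-\gamma)\,\frac{dP}{P_0}
  = (\alpha_1 - \beta)\,\frac{dQ}{Q_0} + \frac{dE}{P_0 Q_0}

% Locally linearized demand with elasticity \eta and demand shift \delta:
\frac{dQ}{Q_0} = \eta\,\frac{dP}{P_0} + \delta

% Eliminating dP/P_0:
\frac{dQ}{Q_0}\left[1 - \frac{\eta\,(\alpha_1 - \beta)}{1-\gamma}\right]
  = \delta + \frac{\eta}{1-\gamma}\,\frac{dE}{P_0 Q_0}

% Enforcement only (\delta = 0): cost that must be imposed for a given cut
dE = \left(\alpha_1 - \beta - \frac{1-\gamma}{\eta}\right)
     \left(-\frac{dQ}{Q_0}\right) P_0 Q_0

% Demand shift only (dE = 0): shift needed for a given change in quantity
\delta = \left[1 + \frac{\eta\,(\beta - \alpha_1)}{1-\gamma}\right]\frac{dQ}{Q_0}
```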

Reducing equilibrium consumption by 1% would require imposing costs of 
(α₁ - β - (1 - γ)/η) * 1% * $35.3 billion = $194 million on suppliers. The 
cost to taxpayers to “purchase” this cost-imposition depends on how 
efficient enforcement is. Consider a policy of giving longer sentences to 
people who already would have been convicted and incarcerated at least 
briefly. (Thus we can ignore details of arrest, adjudication, seizures, and so 
forth.) Suppose drug suppliers have to increase workers’ wages by $50,000 
to compensate them for the risk of each additional expected year of 
incarceration. It costs taxpayers about $25,000 to incarcerate someone for a 
year [95], so the efficiency ratio is 2:1 and taxpayers could buy that 1% 
reduction in cocaine consumption for $97 million per year. 
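
The supply-side calculation can be reproduced directly from the parameter values quoted above. The sketch below is only a restatement of that arithmetic. The function name is hypothetical, and the 2:1 efficiency ratio is the text's assumption of $50,000 in added supplier cost per $25,000 of incarceration cost.

```python
def supplier_cost_for_cut(cut, spending, eta, alpha1, beta, gamma):
    """Cost that must be imposed on suppliers to cut consumption by `cut`.

    cut: desired proportional reduction in consumption (0.01 for 1%).
    spending: total retail spending, P0 * Q0, in dollars.
    eta, alpha1, beta, gamma: elasticity and cost shares defined in the text.
    """
    return (alpha1 - beta - (1.0 - gamma) / eta) * cut * spending


SPENDING_2000 = 35.3e9            # dollars spent on cocaine in 2000, from [94]
EFFICIENCY_RATIO = 50000 / 25000  # supplier cost imposed per taxpayer dollar

cost_imposed = supplier_cost_for_cut(
    cut=0.01, spending=SPENDING_2000,
    eta=-1.0, alpha1=0.0, beta=0.2, gamma=0.25,
)
taxpayer_cost = cost_imposed / EFFICIENCY_RATIO

if __name__ == "__main__":
    print(f"cost imposed on suppliers: ${cost_imposed / 1e6:,.0f} million")
    print(f"annual cost to taxpayers:  ${taxpayer_cost / 1e6:,.0f} million")
```

The result is sensitive to the efficiency ratio: halving the compensation suppliers must pay per expected year of incarceration would double the taxpayer cost of the same 1% reduction.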

Alternately, one could cut consumption by 1% by reducing demand by 1 + η 
(β - α₁) / (1 - γ) = 0.73%. Assuming heavy users consume seven times as 
much per capita as do light users, that would require eliminating 0.73% * 
(2.707 + 3.035/7) million ≈ 23,000 heavy users. At first this might seem to be the 
more expensive route: $97 million would only pay for about two treatment 
admissions per person for 23,000 heavy users. However, the supply 
reduction strategy requires spending $97 million per year indefinitely. If the 
23,000 heavy users were somehow removed by treating each twice, 
consumption would be reduced by 1% indefinitely (ignoring indirect effects 
on initiation, which may be a second-order effect in a mature market). At a 
4% discount rate, the present value of $97 million per year forever is $2.4 
billion, or about $100,000 for each of those 23,000 users, enough for some 
two-dozen rounds of treatment per person. 
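
The demand-side comparison can be reproduced in a few lines. This is a sketch under the assumptions stated in the text (a sevenfold consumption ratio between heavy and light users, a 4% discount rate, and a perpetual $97 million annual enforcement cost); the variable names are illustrative.

```python
# Demand-side route to a 1% cut in consumption, using values from the text.

ETA, ALPHA1, BETA, GAMMA = -1.0, 0.0, 0.2, 0.25

# Required percentage reduction in demand for a 1% fall in consumption.
demand_cut_pct = 1.0 + ETA * (BETA - ALPHA1) / (1.0 - GAMMA)

HEAVY_USERS = 2.707e6
LIGHT_USERS = 3.035e6
HEAVY_TO_LIGHT = 7.0   # assumed per capita consumption ratio

# Express all users in heavy-user equivalents, then take the required share.
heavy_equivalents = HEAVY_USERS + LIGHT_USERS / HEAVY_TO_LIGHT
users_to_remove = demand_cut_pct / 100.0 * heavy_equivalents

# Present value of paying $97 million per year forever at a 4% discount rate.
annual_enforcement_cost = 97e6
discount_rate = 0.04
present_value = annual_enforcement_cost / discount_rate
budget_per_user = present_value / users_to_remove

print(f"Demand reduction needed: {demand_cut_pct:.2f}%")
print(f"Heavy users to remove:   {users_to_remove:,.0f}")
print(f"Present value:           ${present_value / 1e9:.1f} billion")
print(f"Budget per heavy user:   ${budget_per_user:,.0f}")
```

Because the unrounded demand cut is used, the user count differs slightly from the rounded figure in the text, but both round to about 23,000.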

There is a sharp distinction between the timing of the costs and benefits of 
treatment, conventional enforcement, and extending time served for 
convicted traffickers with mandatory minimum sentences [93]. Raising 
prices by threatening sanctions brings immediate benefits, since suppliers 
have to adjust their cost structure in the short run. Secondary, long-lasting 
benefits also accrue: raising prices today suppresses initiation and increases 
quitting, thereby reducing future demand. So supply-side enforcement’s
benefits are predominantly upfront. The costs of enforcement with 
conventional sentences also occur mainly in the first year or two, but the 
longer the sentence, the longer the period over which costs to taxpayers are 
spread. Furthermore, if the policy change is one that extends the sentence of 
someone who would have been incarcerated anyhow, the incremental costs 
do not begin to be felt until after the end of the baseline sentence. The time 
profile of treatment costs and benefits is very different. Treatment costs 
essentially all come in the first year, as do the “incapacitation” benefits of 
reduced use during treatment. However, the benefits of convincing someone 
to quit continue to accrue throughout the entire period during which they 
would otherwise have continued to consume. Informally, conventional 
enforcement is like paying cash, mandatory minimum sentences are like 
buying with a credit card, and treatment is like an investment. 

Rydell and Everingham [24] and Caulkins et al. [93] examine in detail this 
issue of the timing of the benefits and costs of various interventions. 
Roughly speaking, the result is as follows. Suppose a treatment intervention 
and an enforcement intervention each have the same impact on consumption 
over the next 15 years, discounting future outcomes at 4% per year. (More 
specifically, imagine the enforcement operation is one whose effect stems
from raising suppliers’ cost of operations over the next year.) The treatment 
intervention would have about double the impact on consumption in the first 
year as it would in each succeeding year, whereas the ratio for enforcement 
is about nine to one. The enforcement intervention would have about 2.7 
times as great an impact on consumption in the first year as does treatment, 
whereas in every succeeding year the treatment program would have 1.65 
times as much impact as the enforcement program. Hence, although 
treatment may be the more effective way to reduce use in the long run, 
enforcement has greater capacity to focus its benefits in the present, a 
capability that may be invaluable when trying to interrupt the contagious 
spread of initiation early in a drug epidemic. 

One concern with price-raising enforcement is that it might increase crime 
even if it reduces use. Most drug-related crime is “economic-compulsive” 
(committed to obtain money to buy drugs) or “systemic” (arising from drug 
selling, e.g., punishment for non-payment) and so is driven by drug dollars, 
not by intoxication or use per se (also called “psychopharmacological” drug- 
related crime). Depending on the elasticity of demand, driving up prices 
could actually increase, not decrease, drug-related crime. A very simple 
model of this conveys the basic intuition. Suppose that drug-related crime is 
proportional to a weighted sum of drug use and spending on drugs, with the 
latter accounting for 100x% of the total. So drug-related crime C equals 



\[ C = k_1 P Q + k_2 Q, \]

for some positive constants k_1 and k_2 such that k_1 P Q = x (k_1 P Q + k_2 Q),
i.e., k_1 P / k_2 = x/(1-x). Taking the derivative of crime with respect to price
gives

\[ \frac{dC}{dP} = k_1 Q + (k_1 P + k_2)\frac{dQ}{dP} = k_1 Q \left(1 + \frac{\eta}{x}\right), \]

where η = (dQ/dP)(P/Q).
Hence, driving up prices reduces drug-related crime if the absolute value of
the elasticity of demand (|η|) is greater than x, the proportion of drug-related
crime that is driven by drug spending rather than drug use. 
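
The threshold condition can be checked numerically. The sketch below assumes a constant-elasticity demand curve, Q proportional to P raised to the power η, and normalizes price and crime to 1 at the starting point; both choices are illustrative assumptions rather than part of the model in the text.

```python
# Crime model C = k1*P*Q + k2*Q with constant-elasticity demand Q = P**eta.
# The demand curve and all numerical values are illustrative assumptions.

def crime(price, eta, x, base_price=1.0):
    """Drug-related crime, normalized so that crime equals 1 at base_price."""
    # At base_price the quantity is 1; k1 and k2 are chosen so that
    # spending accounts for a share x of crime at that point.
    k1 = x / base_price
    k2 = 1.0 - x
    quantity = (price / base_price) ** eta
    return k1 * price * quantity + k2 * quantity

def crime_falls_when_price_rises(eta, x, bump=1e-6):
    """Numerically check the sign of dC/dP at the base price."""
    return crime(1.0 + bump, eta, x) < crime(1.0, eta, x)

SPENDING_SHARE = 5 / 6   # share of crime driven by drug spending
for eta in (-1.0, -0.9, -0.8, -0.5):
    falls = crime_falls_when_price_rises(eta, SPENDING_SHARE)
    print(f"eta = {eta:+.1f}: crime falls when price rises? {falls}")
```

With a spending share of 5/6, the sign flips between elasticities of -0.9 and -0.8, which is where |η| crosses x.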

Figure 12.3 Relative effectiveness of demand reduction and price-raising
enforcement depends on the elasticity of demand

Figure 12.3 uses the market equilibrium model above to illustrate in more
detail how the effects of price-raising enforcement and demand reduction on
drug use, spending, and crime depend on the elasticity of demand (see
footnote 8).



12.2.5 Dynamic/epidemic modeling results

Drug use varies dramatically over time, driven in no small part by 
endogenous nonlinear dynamics, not just in response to changes in policy or 
exogenous factors such as the poverty rate. Hence, one would expect the 
effectiveness of interventions to likewise vary with the state of the epidemic, 
and a growing literature investigates this possibility. According to this 
school of thought, it is rarely sensible to make statements such as “treatment 
is better than enforcement” or vice versa without qualifying the statement 
(e.g., “treatment is better than enforcement for controlling cocaine use in the 
US now that the epidemic has plateaued”). 



8 Parameters from Caulkins et al. [92] and assuming 5/6 of drug-related crime is
driven by spending. The enforcement intervention is imposing $1 million in costs on 
suppliers. The demand reduction intervention is eliminating 100 heavy users. 

Perhaps the most important endogenous dynamic is the “contagious”
character of drug initiation. Unlike infectious diseases, drug use has no 
pathogen, but drug use is contagious in the sense that drug use spreads when 
non-users are introduced to the drug by current users. (Contrary to once 
popular myth, most initiation does not stem from dealers “pushing” the 
drugs on potential users; rather, new users are initiated by current users.) In 
formal terms, there is a positive feedback from current use to initiation. 
Systems with such a feedback can grow explosively. 

There are several models of how that explosive growth ends, depending in 
part on what country and drug is being modeled. In a line of modeling 
pioneered by Tragler [95-98], a steady state emerges when quitting at a 
constant per capita rate balances initiation, which is an increasing but 
concave function of use. 
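
A minimal sketch of this balance follows. It assumes initiation takes the power form k·A^a with 0 < a < 1, which is one possible increasing and concave function chosen here for illustration and not necessarily the specification used in [95-98]; the parameter values are hypothetical.

```python
# Illustrative one-state epidemic model: dA/dt = k * A**a - mu * A.
# Functional form and parameter values are assumptions for illustration.

def simulate_users(a0, k, a, mu, years, dt=0.01):
    """Euler integration of the number of users A(t)."""
    users = a0
    for _ in range(int(years / dt)):
        initiation = k * users ** a   # increasing and concave for 0 < a < 1
        quitting = mu * users         # constant per capita quit rate
        users += (initiation - quitting) * dt
    return users

def steady_state(k, a, mu):
    """Positive level of use at which initiation equals quitting."""
    return (k / mu) ** (1.0 / (1.0 - a))

K, A_EXP, MU = 1.0, 0.5, 0.1   # hypothetical parameters
print("Analytic steady state:", steady_state(K, A_EXP, MU))
print("Simulated after 200 years:", simulate_users(1.0, K, A_EXP, MU, 200))
```

Starting from a small number of users, growth is rapid at first because initiation exceeds quitting, and it slows as the concave initiation term is overtaken by the linear quitting term.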

In a line of models pioneered by Behrens [70, 71, 99] the key negative 
feedback pertains to the drug’s reputation. As some early initiates progress 
from light to heavy use, the drug’s dangers become apparent and initiation 
declines. That decline, coupled with the high quit rates for light users, 
increases the ratio of heavy to light users, further enhancing the drug’s 
negative reputation and cutting initiation. These models can, for some 
parameter values, generate recurrent cycles of drug epidemics. Almeder 
[100] examined a related family of age-distributed models in which the 
nature and intensity of this feedback depends on the relative and absolute 
ages of the users and potential users. 

In a line of models associated with Rossi and colleagues (e.g., [42, 43]), the 
limiting factor is the number of susceptibles. To over-simplify, essentially 
everyone who might try the drug ends up trying it. Most use only briefly, 
but some get hooked, so after the explosive growth stage there is a decline to 
an endemic problem characterized by a high proportion of heavy users. 

The overall policy prescription from these models is to rely on enforcement 
early in a drug epidemic and rely on treatment later in the epidemic. 
Prevention can be extraordinarily cost effective if done before and at the 
beginning of an epidemic; later it is much less effective, but is still worth 
doing. In particular, keeping prices high initially is a useful way to slow the 
explosive spread of drug use, but later on high prices are costly to maintain 
and may exacerbate drug-related crime. More generally, one should initially 
fight very aggressively to contain a drug epidemic. Ideally the epidemic 
would be eradicated or stabilized at low levels, but if the intervention is too 
late or the epidemic growth too great, then one should accommodate the 
growth in drug use by gradually shifting to strategies that remove heavy 
users and/or ameliorate the social cost per heavy user. 

Tragler et al. [96] offer an example of such a finding. Their model seeks the
optimal dynamic levels of price-raising enforcement and drug treatment that 
together minimize the present value of the sum of control spending plus the 
quantity of drugs consumed, weighted by the social cost per unit of 
consumption, subject to drug use evolving according to the following 
nonlinear model of drug use (modeled by a set of differential equations): 
Initiation is concave in use. “Natural” quitting is at a constant rate per 
capita, which can be augmented by treatment, although with diminishing 
efficiency as the proportion of users in treatment increases. Prices affect all 
flow rates and are in turn a function of the intensity of price-raising 
enforcement as above. 

Figure 12.4 updates a figure from Tragler et al. [96], using a slightly larger 
exponent on endogenous initiation in light of Grosslicht’s [101] findings.
The horizontal axis depicts the number of users and the vertical axis gives 
the optimal annual control spending (in thousands of dollars). A so-called 
Dechert-Nishimura-Skiba threshold (labeled A_DNS) occurs when the number
of users is about 1.3 million. If the initial number of users is less than this 
threshold value (i.e., control begins before the epidemic has passed this 
point), the optimal strategy is to use massive levels of enforcement and 
treatment to reduce use to some minimal level. Otherwise, it is optimal to let 
use grow toward a positive equilibrium. In that case, enforcement and 
treatment spending should increase with use, but with the proportion of 
control spending allocated to treatment increasing over time. This finding of 
a sharp choice between eradication and accommodation at the aggregate 
level is consistent with others’ analyses of the impact of enforcement on 
local drug markets (e.g., [36, 40, 41, 102]). 

A key driver of this dynamic is “enforcement swamping” [103]. The 
deterrent or price-raising potential depends on enforcement’s intensity - i.e., 
the amount of enforcement per kilogram or per person in the market - not 
the absolute level of enforcement. Early in an epidemic, when the market is 
small, it is not so hard to achieve high enforcement intensity. When the 
market doubles in size, the intensity generated by a given enforcement level 
is halved because that enforcement is spread over a larger target. Since drug 
use can much more than double over an epidemic, overcoming this dilution 
for an established mass-market drug is very expensive. 
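
The dilution effect is simple to state in code. The sketch below assumes intensity is just the enforcement level divided by market size, which is the simplest reading of the description above; the units and numbers are hypothetical.

```python
# Enforcement swamping: intensity is enforcement per unit of market size.
# The simple ratio form and the numbers are illustrative assumptions.

def enforcement_intensity(enforcement_level, market_size):
    """Enforcement per person (or per kilogram) in the market."""
    return enforcement_level / market_size

def enforcement_needed(target_intensity, market_size):
    """Enforcement level required to hold intensity at a target."""
    return target_intensity * market_size

early = enforcement_intensity(100.0, 1_000.0)
doubled = enforcement_intensity(100.0, 2_000.0)
print("Intensity in a small market:  ", early)
print("Intensity after market doubles:", doubled)
print("Enforcement needed at 10x size:", enforcement_needed(early, 10_000.0))
```

Holding intensity constant therefore requires enforcement to grow in step with the market, which is why early intervention is comparatively cheap.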

One of the more interesting insights to emerge from these optimal control 
models comes from Behrens et al.’s [70, 99, 104] complementary analysis of 
prevention and treatment. It extends Everingham and Rydell’s [23] model of 
cocaine use in Figure 12.1 to make initiation increasing in the number of 
light users and decreasing in the number of heavy users. Insights derived 
from this model include the following: (1) Prevention is most valuable when
there are relatively few heavy users, such as in the beginning of an epidemic.
Treatment is more effective later. (2) The transition period when it is optimal
to use both prevention and treatment is very brief. (3) Total social costs
increase dramatically if control is delayed.

Figure 12.4 Optimal control spending as a function of the number of
users, illustrating Tragler et al.’s finding that if control catches the
epidemic early it should seek to “eradicate” the epidemic; otherwise
accommodation is the optimal strategy

The second insight is particularly interesting because many people describe 
the strategic drug policy choice as concerning supply-side vs. demand-side 
interventions. Behrens et al. show that it is misleading to lump together 
treatment and prevention even though they both affect demand. At any 
given point in an epidemic, prevention might be very valuable but not 
treatment or vice versa. Indeed, when Behrens et al.’s model is 
parameterized for the US cocaine epidemic and school-based prevention 
(which has a roughly 8-year lag between program spending and effect on 
initiation), it is literally never optimal to spend money on both prevention 
and treatment! This is illustrated in Figure 12.5, which is adapted from 
Behrens et al. [70]. A complete absence of overlap is not robust with respect 
to parameter variation, and as discussed above, prevention is probably 
justified on an ongoing basis because of its impact on the use of other drugs. 
Nevertheless, the general message is robust: It is simplistic to argue for or 
against “demand-side” (or “supply-side”) strategies without knowing more 
about the specific mix of strategies and the current state of the epidemic in
question. 



Figure 12.5 Optimal cocaine control spending levels over time for
school-based prevention and treatment for the past US cocaine epidemic




12.3 OPPORTUNITIES FOR FURTHER RESEARCH 

Drug policy is an important domain. It has enough nonlinearities from 
epidemic feedback and market dynamics to challenge unguided intuition, so 
formal mathematical models such as those reviewed here can be a very 
important aid to strategic planning. There remains, however, far more that is 
not known than is known, so possibilities for further research are great. 
Many of the present generation of models are highly stylized. It is important 
to discover what current findings are robust and what new findings emerge 
as the models are expanded to consider more factors and interactions. 

For example, most of the current models consider a single drug or an 
undefined amalgam of all drugs. However, drugs interact with one another 
in many ways. At the individual level, drugs interact in users’ bodies so that 
drugs taken in combination can lead to overdose even when larger doses of 
each drug singly would not. At the level of a drug use career, use of one 
drug can affect use of others, both in the narrow economic sense of being 
consumption substitutes or complements, and in the broader social sense, 
e.g., when use of one substance brings an individual into contact with users 
and sellers of other drugs. Interactions also occur at the market level: for 
example, the presence of established distribution networks for one drug (e.g.,
Colombian cocaine) can facilitate the spread of another drug (e.g., 
Colombian heroin), and control resources devoted to one drug may not be 
available for another. 

Dependent users often have multiple medical conditions. Many are “dually 
diagnosed” with mental health and substance abuse disorders. Many are 
infected with HIV or Hepatitis C. Complicated interactions can occur in 
treatment regimens (how successful will dependent users be in complying 
with complicated HIV control regimens?; see Turner et al. [105]) and 
treatment financing (cost containment pressures may encourage restrictions 
on drug treatment, but resumption of drug use can increase other health care 
costs in the long run; see Sturm et al. [106] and Sturm and Pacula [107]). In 
some ways it makes more sense to think about the cost effectiveness of drug 
treatment relative to the cost effectiveness of other medical interventions 
than it does to compare drug treatment to criminal justice or prevention 
interventions. 

Drug policy intersects not only with health and crime, but also social policy 
more generally [108]. For example, the issues of the dually diagnosed are 
particularly problematic for those who are also homeless [109]. Models that 
disaggregate types of users (e.g., homeless vs. other) and evaluate 
interventions tailored to one subpopulation or another would refine current 
understanding of broad strategic themes.

Perhaps the greatest need, though, is for more fundamental understanding of 
how drug epidemics evolve. This is perhaps best gained by modeling more 
epidemics, both at lower levels of geographic aggregation (e.g., in individual 
cities within the US) and in other countries. Comparative studies across 
drugs, cultures, political structures, and market conditions would help clarify 
what aspects of epidemic dynamics are fundamental and which are 
idiosyncratic to a particular context. A defining characteristic of nonlinear 
systems is that the magnitude of the response to a given intervention is 
nonlinear. Sometimes the response is less than proportionate; sometimes it 
is much more. Historically, drug control interventions have often produced 
less than hoped for effects. It may be that all these interventions are 
inherently ineffective or have been poorly conceived or executed. An 
alternative explanation, however, is that they simply have not been “timed” 
or “tuned” appropriately because the nonlinear character of the underlying 
epidemics has not been fully appreciated. In this alternate, more optimistic 
view, advances in understanding of drug epidemics will not only help us to 
choose the best among a range of interventions which may all have mediocre 
performance, but also to enhance the effectiveness of all interventions. 

SUMMARY

In this chapter we discuss recent operations research advances in modeling 
drug treatment programs for injection drug users, in particular maintenance 
treatment programs for opioid addicts. We focus on four main questions for 
which operations research techniques have proven beneficial: How effective 
are opioid maintenance programs? Do the benefits of methadone 
maintenance treatment justify its costs? Are alternative forms of maintenance 
treatment cost effective? If opioid maintenance treatment programs are 
expanded, how many new treatment slots are needed? We discuss a number 
of methodological issues and highlight directions for future research. 

13.1 INTRODUCTION

Injection drug use is a significant public health problem in the United States. 
Between 750,000 [1] and 1.2 million [2] individuals in the U.S. are injection
drug users (IDUs). Injection drug use is a major risk factor for human 
immunodeficiency virus (HIV). The prevalence of HIV among IDUs 
exceeds 40% in some cities [3]. Approximately 20-25% of all new HIV 
cases and 35% of all acquired immune deficiency syndrome (AIDS) cases in 
the United States have injection drug use as a risk factor [4]. IDUs may
serve as a “core group” [5] in the HIV epidemic and spread HIV to non-IDUs
through sexual contact. In addition to HIV, IDUs are subject to a number of 
other comorbidities including hepatitis, tuberculosis [6], overdose and 
accidental death [7], and have mortality rates that are up to 60 times greater 
than those of other members of their age group [8]. IDUs may make greater 
use of emergency health services and less use of regular health services, and 
have annual health care expenditures that are three to four times greater than 
those of other members of their age group [9, 10]. Injection drug use is also 
associated with increased criminal activity and increased costs to the criminal 
justice and welfare systems [11, 12]. 

Methadone was developed in Germany during World War II as a substitute 
for morphine and has been used in the treatment of heroin addiction for more 
than 30 years [13]. Methadone has a slow onset and a long delay, with 
effects lasting up to 24 hours, and can be taken once a day to curb heroin 
withdrawal symptoms. Methadone maintenance treatment (MMT) is 
associated with reduced illicit drug use, reduced HIV risk behavior, and 
reduced drug and property-related criminal activity [14, 15]. Non-HIV 
health care expenditures are lower for IDUs in MMT than for IDUs not in 
MMT [16]. A meta-analysis of studies comparing IDUs in MMT versus 
IDUs not in MMT found a relative risk of death of 0.24 - 0.43 associated 
with MMT [17]. 

Methadone is a highly regulated substance in the U.S. [18] and is classified 
as a Schedule II narcotic (meaning that it has a high potential for abuse) by 
the U.S. Drug Enforcement Administration [19]. There are only 
approximately 115,000 methadone treatment slots in the U.S., or roughly 
enough for 10-20% of all IDUs nationwide [18]. Some states do not have 
any methadone programs [20]. Methadone is typically administered daily 
under supervised settings to prevent potential abuse of the drug. This and 
other regulations contribute to the high cost of MMT. Estimates of the 
annual cost of one methadone treatment slot range from $4,300 [21] to 
$5,250 [22] (in 19% dollars). However, methadone is a generic drug that 
costs less than $1 per day [23]. One study found that drug costs accounted 
for only 5-6% of the total cost of a methadone treatment slot [21]. 




MMT is controversial in the United States. The former U.S. “Drug Czar” 
(Director of the Office of National Drug Control Policy) publicly supported 
increased funding and use of methadone as part of a drug abuse reduction 
strategy [24]. A report from the National Institutes on Drug Abuse has 
advocated expanded methadone capacity [25], and drug abuse prevention has 
recently been included among the principles for HIV prevention among IDUs 
[26]. However, support for MMT is not universal. In 1999, a bill was 
introduced in the U.S. Congress that called for limiting methadone funding 
and access [27]. In 1996, the mayor of New York City declared that 
methadone was immoral and represented the substitution of one drug 
(methadone) for another (heroin) [28]. He subsequently reversed his position 
and devoted $5 million to increased city funding of methadone programs 
[29]. 

This chapter reviews modeling work that evaluates programs to treat opioid 
dependence. We focus on four major questions where operations research 
techniques have provided insight into the value of such programs: 

1. How effective are opioid maintenance programs? 

2. Do the benefits of MMT justify its costs? 

3. Are alternative forms of maintenance treatment cost effective? 

4. If opioid maintenance treatment programs are expanded, how many 
new slots are needed? 

Following the discussion of the questions we highlight some methodological 
issues and describe a number of promising areas for future research. 

13.2 MODELS OF OPIOID MAINTENANCE PROGRAMS 

13.2.1 How effective are opioid maintenance programs? 

MMT programs may not lead to a complete cessation of drug use but, rather, 
a reduction in usage; similarly, they may not lead to complete cessation of 
needle sharing. Additionally, many IDUs lead unstructured lives, leading to 
substantial difficulties in fulfilling the follow-up and monitoring 
requirements of many studies. Statistical techniques that only record 
“success” or “failure”, as well as those that do not handle large amounts of 
missing or censored data, may not be suited to the assessment of opioid 
maintenance programs. This has motivated the development of new techniques.

Lee [30] and Weng [31] developed models to assess the effectiveness of 
methadone and buprenorphine maintenance programs in the presence of 
missing observations. Buprenorphine is an alternative to methadone that has
only recently been approved for maintenance treatment in the U.S. Both 
models were fit using data from a 17-week randomized clinical trial to 
evaluate the effectiveness of buprenorphine [32]. Patients in the trial were 
randomized into three groups: Group 1 received 8 mg of buprenorphine 
daily; Group 2 received 20 mg of methadone daily; and Group 3 received 60 
mg of methadone daily. Patients in each group were asked to provide urine 
samples three times per week to assess their drug consumption while in 
treatment. Between 60% and 80% of the members of each group were lost to 
follow-up, and approximately 18% of urine samples among those not lost to
follow-up were missed in each group.

Weng [31] developed a stochastic compartmental model with three
compartments representing negative urinalysis (N_1(t)), positive urinalysis
(N_2(t)), and missed test (N_3(t)). The model was formulated as a
continuous-time stochastic process, as depicted in Figure 13.1. Clinical trial
data were used to estimate flow rates between states for the three study
groups. Transitions between any two states were allowed, and the population
was assumed to be closed. The 17-week period was partitioned into four or
five segments for each group. The steady state probability of being in the 
negative state was found to be between .403 and .566 for Group 1 
(buprenorphine); between .183 and .353 for Group 2 (20 mg methadone); and 
between .271 and .465 for Group 3 (60 mg methadone). 
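
To illustrate how a steady-state probability follows from a set of flow rates, the sketch below computes the stationary distribution of a three-state continuous-time Markov chain. The flow rates are hypothetical and are not Weng's estimates, and uniformization followed by repeated multiplication is one convenient numerical approach assumed here, not necessarily the one used in [31].

```python
# Steady-state distribution of a three-state continuous-time Markov chain.
# States: 0 = negative urinalysis, 1 = positive urinalysis, 2 = missed test.
# The flow rates below are hypothetical, not Weng's estimates.

def steady_state(rates, iterations=10_000):
    """rates[i][j] is the flow rate from state i to state j (i != j)."""
    n = len(rates)
    exit_rates = [sum(rates[i][j] for j in range(n) if j != i)
                  for i in range(n)]
    scale = 1.1 * max(exit_rates)   # uniformization constant
    # Discrete-time chain with the same stationary distribution.
    step = [[rates[i][j] / scale if i != j else 1.0 - exit_rates[i] / scale
             for j in range(n)] for i in range(n)]
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * step[i][j] for i in range(n))
                for j in range(n)]
    return dist

weekly_rates = [
    [0.0, 0.6, 0.3],   # from negative
    [0.5, 0.0, 0.4],   # from positive
    [0.7, 0.5, 0.0],   # from missed
]
pi = steady_state(weekly_rates)
print("Steady-state probability of a negative result:", round(pi[0], 3))
```

Estimating a separate rate matrix for each segment of the trial and each study group would yield a range of steady-state values per group, in the spirit of the ranges reported above.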

Lee [30] used a two-state discrete-time Markov chain to examine the 
effectiveness of methadone and buprenorphine programs. The discrete time 
steps corresponded to urine sample collection points. The two states, 
denoted by 0 and 1, represented negative and positive urinalysis results, 
respectively. Estimation procedures were developed to estimate the 
transition probabilities between the two states given a sequence of urinalysis 
results that may contain missing observations. Maximum likelihood 
estimates for P_1, the probability of opiate use during the 17-week period,
were calculated. In one set of calculations, it was found that P_1 = 0.4734 for
Group 1 (buprenorphine), P_1 = 0.6288 for Group 2 (20 mg methadone), and
P_1 = 0.4970 for Group 3 (60 mg methadone). In a second set of calculations
it was found that P_1 was 0.3664, 0.6260, and 0.4854 for the three groups,
respectively. 
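
The idea of estimating a two-state chain from an incomplete record can be sketched as follows. Counting only adjacent pairs in which both samples were observed is a deliberate simplification assumed here for illustration; it is not the estimation procedure developed by Lee [30], and the patient record is hypothetical.

```python
# Two-state Markov chain for urinalysis results: 0 = negative, 1 = positive.
# Missing samples are marked None. Counting only adjacent observed pairs is a
# simplifying assumption for illustration, not the estimator in Lee [30].

def estimate_transitions(results):
    """Return (p01, p10) from a sequence of 0, 1 and None values."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, curr in zip(results, results[1:]):
        if prev is not None and curr is not None:
            counts[(prev, curr)] += 1
    from_0 = counts[(0, 0)] + counts[(0, 1)]
    from_1 = counts[(1, 0)] + counts[(1, 1)]
    p01 = counts[(0, 1)] / from_0 if from_0 else float("nan")
    p10 = counts[(1, 0)] / from_1 if from_1 else float("nan")
    return p01, p10

def long_run_positive_share(p01, p10):
    """Stationary probability of state 1 (positive urinalysis)."""
    return p01 / (p01 + p10)

# Hypothetical patient record with some missed samples.
record = [1, 1, None, 0, 0, 1, None, None, 1, 0, 0, 0, 1, 1]
p01, p10 = estimate_transitions(record)
share = long_run_positive_share(p01, p10)
print(f"p01 = {p01:.3f}, p10 = {p10:.3f}, long-run positive share = {share:.3f}")
```

Discarding pairs that touch a missing sample throws away information and can bias the estimates if samples are missed more often after drug use, which is part of the reason a dedicated estimation procedure is needed.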

Figure 13.1 Compartmental model to investigate the effectiveness
of methadone and buprenorphine treatment

The methods developed by Lee and Weng [30, 31] address some of the
difficulties in assessing the effectiveness of opioid maintenance programs. 
However, there are still opportunities for new methods. There may be many 
definitions of “success” related not only to drug use but also to the frequency 
of engaging in risky behavior and frequency of use of drugs other than 
heroin. Also, IDUs may simultaneously use several drug treatment services 
(e.g., MMT, counseling, support groups). Future methods may seek to
address multiple definitions of success as well as the incremental impact of 
each service. 

13.2.2 Do the benefits of MMT justify its costs? 

Drug treatment programs, including MMT, are often seen as primarily 
benefiting one group (IDUs) while being paid for by another group 
(taxpayers). This discordance has generated much interest in understanding 
whether the benefits of MMT justify its costs. Much of the analysis of this 
question has utilized cost-benefit analysis (CBA) or cost-effectiveness 
analysis (CEA). Recent debate has questioned whether CBA and CEA can 
be considered equivalent [33-37]. Applications of these techniques in the 
evaluation of opiate treatment programs clearly are not equivalent: 
researchers performing CBA tend to focus on the social impact of drug 
treatment (such as crime, judicial costs, social welfare, etc), whereas 
researchers performing CEA tend to focus on the health impacts of drug 
treatment (including mortality, comorbidities, and HIV infection). 











Both techniques often make use of the quality-adjusted life year of survival 
(QALY) as a way of characterizing the health benefits of a program [38].
QALYs represent utilities for health states and are scaled between 0, 
representing death, and 1, representing perfect health. 

Cost-benefit analysis of methadone programs. In cost-benefit analysis, all
costs and benefits of a proposed program are converted into monetary units. 
For a health care program, this requires conversion of health outcomes into 
monetary units. The requirement that health outcomes be explicitly valued is 
often cited as a criticism of CBA in the evaluation of health care programs. 
Results of a CBA are typically expressed in several different formats, 
including the net benefits approach (net benefits = total benefits - total costs) 
and the benefit-to-cost ratio. It has been argued that CBA may be preferable 
to CEA for drug abuse interventions since many of the benefits (e.g., 
improved employment, reduced criminal activity) are not health related [39]. 

An early cost benefit analysis of methadone programs was provided by 
Hannan [40]. The analysis focused on four direct impacts of methadone 
treatment: decreased criminal justice expenditures, decreased health care 
costs for heroin-related conditions, decreased expenditures on heroin, and 
increased legal earnings among those treated. The monetary value of 
property theft crimes was not included in the analysis since it is a transfer of 
wealth with no net impact. The analysis was based on data from a New York 
MMT program in 1965. For a six-year time horizon, benefit-to-cost ratios of 
1.47 to 4.40 were found, depending on which benefits were included in the 
analysis. For a projected 33-year time horizon, benefit-to-cost ratios of 1.86 
to 5.09 were found. In all cases considered, the benefits were substantially 
greater than the costs. A limitation of this study was that health care costs 
were included, but the health benefits of MMT were not. 

French and colleagues described a methodology for conducting benefit-cost 
analysis of methadone programs [39, 41]. The methodology involves 
converting scores from a disease severity index into QALYs, and then 
multiplying QALY estimates by the societal willingness to pay (WTP) for a 
QALY to yield the monetary value of health outcomes. Health benefits are 
converted into monetary outcomes using the following formula: 

\[ \text{health benefits} = \frac{1}{N} \sum_{i=1}^{N} (1 - QA_i) \times (\$/\text{QALD}) \qquad (1) \]

where N = 19 (corresponding to 19 comorbidities that are common among
IDUs), QA_i is the quality-of-life adjustment for condition i, and QALD is the
societal WTP for one quality-adjusted life day [41]. The value of QALD was
$173.08, derived from an estimate of the value of life [42]. The net cost is
computed by adding the other monetary costs to the health costs. This CBA 
methodology is illustrated with sample calculations [39] and with a full CBA 
based on data from the Philadelphia Target Cities Project [41]. 
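
A short sketch of the conversion follows, reading formula (1) as the equal-weight average over the N conditions of one minus the quality-of-life adjustment, multiplied by the dollar value of a quality-adjusted life day. The quality adjustments used below are hypothetical placeholders and are not values from [41].

```python
# Monetary value of health outcomes under an equal-weight reading of formula (1).
# The quality adjustments below are hypothetical, not values from the source.

QALD_VALUE = 173.08   # dollars per quality-adjusted life day, as quoted

def health_benefits(qa_weights, qald_value=QALD_VALUE):
    """Equal-weight average of (1 - QA_i), valued at the WTP per QALD."""
    n = len(qa_weights)
    return sum(1.0 - qa for qa in qa_weights) / n * qald_value

# Hypothetical adjustments for N = 19 conditions.
example_weights = [0.9] * 10 + [0.8] * 9
daily_value = health_benefits(example_weights)
print(f"Value per day: ${daily_value:.2f}")
```

Every condition receives the same weight of 1/N in this calculation, regardless of how common it is among IDUs.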

A similar methodology was used to conduct a benefit-cost analysis of two 
levels of intensity of addiction service, denoted by “partial continuum” (PC) 
and “full continuum” (FC), in Washington State [43]. Costs included those 
related to health care, psychiatric status, employment status, drug and alcohol 
use, and legal status. Health benefits were derived in part by converting 
changes in the Addiction Severity Index [44] for treated patients into 
monetary units. FC only, PC only, and FC and PC together had average net 
benefits of $17,833, $11,173, and $15,305, respectively (expressed in 1997 
dollars) and respective benefit-cost ratios of 9.70, 23.33, and 14.87. 

The conversion of health benefits to monetary units is a necessary part 
of any CBA, but formula (1) has some shortcomings. The formula does not 
address the possibility that some comorbidities are more common among 
IDUs than others. Dividing by N (N=19 in the example given) implicitly 
assumes a prevalence of 1/N (5.3% when N = 19) for all comorbidities, and 
that there is no correlation between different comorbidities. The method also 
does not quantify the impact that drug treatment has on reducing the 
probability of developing one of the comorbid conditions. Also, QALYs 
represent utilities, and it is unclear if scores from a disease severity index can 
be converted to utilities. 

The cost effectiveness of methadone. Cost-effectiveness analysis involves 
calculation of the incremental cost-effectiveness ratio, which is defined as the 
incremental costs of an intervention divided by the incremental health 
benefits of the intervention [45]. The incremental costs of an intervention 
include the cost of the intervention itself plus the costs associated with all 
future changes in health caused by the intervention. The incremental cost 
term may include non-health care costs if a societal perspective is taken. 
Costs and benefits are typically discounted to reflect the principle that costs 
in the future are preferred to costs today, and benefits today are preferred to 
benefits in the future. 
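
The ratio described above can be sketched in a few lines of Python. This is a minimal illustration, assuming yearly cost and benefit streams and a 3% annual discount rate; the function names, the discount rate, and all numbers are assumptions made for the example and do not come from [45].

```python
def present_value(stream, rate):
    """Discount a yearly stream (year 0 first) to its present value."""
    return sum(x / (1.0 + rate) ** t for t, x in enumerate(stream))


def icer(costs_new, effects_new, costs_old, effects_old, rate=0.03):
    """Incremental cost-effectiveness ratio: discounted extra cost per unit of extra benefit."""
    d_cost = present_value(costs_new, rate) - present_value(costs_old, rate)
    d_effect = present_value(effects_new, rate) - present_value(effects_old, rate)
    if d_effect == 0:
        raise ValueError("no incremental health benefit; the ratio is undefined")
    return d_cost / d_effect


# Hypothetical three-year streams (dollars and QALYs per person per year).
costs_old, effects_old = [1000, 1000, 1000], [0.70, 0.70, 0.70]
costs_new, effects_new = [3000, 1500, 1500], [0.80, 0.80, 0.80]

ratio = icer(costs_new, effects_new, costs_old, effects_old)
print(f"ICER: ${ratio:,.0f} per QALY gained")
```

When the benefit stream is measured in QALYs, as in this example, the same calculation is a cost-utility analysis.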

If the health benefits of the intervention are expressed in terms of QALYs, 
then a CEA may be referred to as a cost-utility analysis (CUA). Conversion 
of the health benefits of interventions into QALYs allows interventions that 
yield very different benefits to be compared. By expressing results as a ratio, 
CEA avoids having to explicitly assign a monetary value to health outcomes. 

Barnett constructed a life table to examine the costs, benefits, and cost 
effectiveness of MMT [46]. Age-specific mortality rates for non-IDUs (i.e., 
individuals who do not inject drugs) were obtained from U.S. life tables. 
Numerous studies have compared mortality rates for IDUs in and out of 
MMT versus those of the general population. For instance, a study in 
Sweden found that IDUs in MMT had 12 times the annual mortality rate of 
individuals in their age group, and IDUs not enrolled in MMT had 63 times 
the mortality rate of individuals in their age group [8]. Age-specific 
mortality rates for IDUs in and out of MMT were obtained by multiplying 
the age-specific rates for the general population by these relative risk rates 
applicable to IDUs. 

Survival until age 65 for an initial cohort of 1,000 25-year-old IDUs was 
calculated using the estimated age-specific mortality rates. This was 
compared to survival of a similar cohort that was assumed to have access to 
MMT. It was assumed that 57.5% of patients in the MMT group received 
methadone and hence received the survival advantage associated with 
methadone. Total discounted life years of survival attained and the total 
costs associated with MMT for each group were determined. The cohort that 
received methadone experienced 8,704 additional discounted life years of 
survival at an incremental cost of $51,486,000, resulting in a cost- 
effectiveness ratio of $5,915 per life year gained. Extensive one-way 
sensitivity analyses revealed cost-effectiveness ratios between $3,300 and 
$9,100 per life year gained. 
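
The life-table calculation described above can be sketched as follows. This is a simplified illustration, not Barnett's implementation: the baseline mortality function is hypothetical (it is not taken from U.S. life tables), the 3% discount rate is an assumption, and the cohort with access to MMT is treated as a fixed mixture of patients on and off methadone. The relative risks (12 and 63), the 57.5% share, and the reported totals come from the text.

```python
def discounted_life_years(start_age, end_age, base_mortality, relative_risk,
                          cohort=1000.0, rate=0.03):
    """Discounted life years lived by a cohort between start_age and end_age."""
    alive = cohort
    total = 0.0
    for t, age in enumerate(range(start_age, end_age)):
        q = min(1.0, base_mortality(age) * relative_risk)  # annual death probability
        deaths = alive * q
        # Survivors contribute a full year; those who die contribute half a year.
        total += (alive - 0.5 * deaths) / (1.0 + rate) ** t
        alive -= deaths
    return total


def base_mortality(age):
    """Hypothetical general-population annual mortality (illustration only)."""
    return 0.001 * 1.08 ** (age - 25)


RR_IN_MMT, RR_OUT_MMT = 12.0, 63.0   # relative risks from the Swedish study [8]
SHARE_ON_METHADONE = 0.575           # share of the MMT group receiving methadone

no_access = discounted_life_years(25, 65, base_mortality, RR_OUT_MMT)
on_mmt = discounted_life_years(25, 65, base_mortality, RR_IN_MMT)
with_access = SHARE_ON_METHADONE * on_mmt + (1 - SHARE_ON_METHADONE) * no_access
print(f"additional discounted life years (hypothetical inputs): {with_access - no_access:.0f}")

# Arithmetic check of the ratio reported in the text.
print(round(51_486_000 / 8_704))  # 5915
```

The last line only confirms that the reported incremental cost and life years are consistent with the reported ratio of $5,915 per life year gained.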

Kahn and colleagues analyzed a number of HIV prevention programs 
including methadone maintenance [47]. In their analysis of MMT they 
constructed two scenarios representing cities with different drug and HIV 
epidemics. They assessed the impact over five years of a one-year expansion 
in MMT capacity. They found that the extra MMT capacity had a cost- 
effectiveness ratio of $48,000 to $60,000 per undiscounted life year gained. 
The analysis considered only the impact of MMT on the spread of HIV and 
did not consider other health care costs. 

A compartmental model of methadone maintenance. Zaric et al. 
developed a compartmental epidemic model to evaluate the cost 
effectiveness of expanded methadone treatment capacity on a population of 
IDUs and non-IDUs [48, 49]. The work was motivated by the difficulty that 
other evaluation techniques have had in quantifying the impact of expanded 
methadone on the spread of HIV. Characterizing the impact of new 
treatment programs on the spread of HIV is important because HIV has a 
significant impact on total health care costs and mortality. 




A model was developed in which the population was divided into nine 
compartments based on behavior (IDU, IDU in methadone, or non-IDU) and 
disease status (not infected with HIV, HIV-infected, and AIDS). The model 
is illustrated in Figure 13.2. The arrows in Figure 13.2 represent transitions 
between behavior and risk classes; these transitions were modeled using a 
system of differential equations. All individuals enter the model as 
uninfected 18-year-old non-IDUs. They remain in that state until they die or 
age out of the model, or they become HIV infected, or they become IDUs. 
IDUs can become infected through sexual or needle-sharing contacts with 
infected individuals, and non-IDUs can only become infected through sexual 
contact. Non-IDUs were assumed to become IDUs at a fixed rate. IDUs 
can remain as IDUs, re-enter the non-IDU population, or 
enter MMT slots as space becomes available. IDUs in MMT can leave MMT 
at any time and enter either the IDU or non-IDU population. 



Figure 13.2 Compartmental model to examine the impact of MMT on HIV 




The model was used to dynamically calculate new infections and rates of 
entry into treatment. Let X_i(t) be the number of individuals in compartment i 
at time t, i = 1,...,9, and let N be the total number of MMT slots available. 
One constraint ensured that the MMT slots were always filled and a second 
constraint ensured that new entrants to MMT were drawn from each disease 
state according to prevalence in the population. 

A challenging aspect of the model formulation was defining a mixing model 
for a population with two types of risk behavior (sexual mixing and needle 
sharing), two levels of sexual risk (with and without condoms), differing 
rates of participation in the two risk activities, and like-with-like sexual 
preferences [50, 51]. The mixing model was defined by first specifying a 
term for the number of new infections. The number of new infections among 
members of compartment i is given by: 

\[ I_i(t) = X_i(t) \sum_{j} \left[ \gamma_{ij}(t) + \beta_{ij}^{H}(t) + \beta_{ij}^{L}(t) \right], \quad i = 1, 4, 7 \qquad (2) \]

where γ_ij(t), β_ij^H(t), and β_ij^L(t) are the sufficient contact rates between 
compartments i and j for injection, high-risk sexual contact (sexual contact in 
which a condom is not used), and less-risky sexual contact (sexual contact in 
which a condom is used), respectively. 

For risky injections, the probability of a contact between members of 
compartments i and j was assumed to be proportional to the total number of 
injections by members of those two compartments. That is, the probability 
that an individual has a contact with a member of compartment j is the total 
number of injections by all members of compartment j divided by the total 
number of injections by all members of all compartments. This probability 
changes over time since the number of people in each compartment changes 
over time. Thus, the sufficient contact rate for injections, γ_ij(t), was given by 
the number of injections per person in compartment i multiplied by the 
probability of a contact between compartments i and j multiplied by the 
probability of disease transmission for a contact between compartments i and 
j. 
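
The proportional-mixing rule for injections described above can be sketched as follows. This is an illustrative reading of the verbal description, not the code of [48, 49]; the function name and all input values are hypothetical.

```python
def injection_contact_rates(x, injections, transmission):
    """Sufficient contact rates for injection under proportional mixing.

    x[j]               number of people in compartment j
    injections[j]      risky injections per person per year in compartment j
    transmission[i][j] probability of transmission per contact between i and j
    """
    size = len(x)
    total = sum(n * k for n, k in zip(x, injections))  # all injections, all compartments
    gamma = [[0.0] * size for _ in range(size)]
    if total == 0:
        return gamma
    for i in range(size):
        for j in range(size):
            p_contact = x[j] * injections[j] / total   # chance a contact is with compartment j
            gamma[i][j] = injections[i] * p_contact * transmission[i][j]
    return gamma


# Hypothetical three-compartment example.
x = [100.0, 50.0, 10.0]
injections = [200.0, 200.0, 150.0]
transmission = [[0.0, 0.005, 0.005]] * 3  # only compartments 1 and 2 are infectious here

gamma = injection_contact_rates(x, injections, transmission)
print([round(g, 4) for g in gamma[0]])
```

Because the compartment sizes change over time, the rates would be recomputed at every time step of a simulation.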

The formulas for the sufficient contact rates for sexual mixing (β_ij^H and β_ij^L) 
were similar to those for shared injections but modified somewhat to account 
for the presence of two types of risky contacts, different rates of condom use 
between IDUs and non-IDUs, and preferential mixing. Let G be the 
proportion of sexual contacts that IDUs have with other IDUs. Let P_i^S be the 
average annual number of new sexual partners, and let P_i^R be the average 
number of new sexual partners of risk R, R = L, H, among members of 
compartment i. Then P_i^L = d_i x P_i^S, and P_i^H = (1 - d_i) x P_i^S, where d_i is the 
probability that an individual in compartment i uses a condom. Let ce be the 
risk reduction achieved by using a condom [52], and let τ_ij^L and τ_ij^H be the 
probabilities of HIV transmission through sexual contacts of type L and H, 
respectively, where τ_ij^L = ce x τ_ij^H. Let M^R(t) = [m_ij^R(t)] be the mixing matrix 
for sexual contacts of type R, R = L, H. Then m_ij^R(t) is given as follows: 

(3) 

Each expression in (3) has two terms. The first is the probability that an 
individual from compartment i has a contact with an individual from the 
group of compartments indexed by j. The second is the conditional 
probability that a contact is with someone from compartment j given that 
there is a contact with someone from the specified group of compartments. 
We explain these values for the first and third expressions in m_ij^R(t) below. 

For i = 1,...,6, j = 1,...,6, the expression for m_ij^R(t) is the probability that an 
IDU has a contact with another IDU multiplied by the probability that the 
contact is with someone from compartment j given that it is with someone 
from compartments 1,...,6. For i = 7,...,9, j = 1,...,6, the expression for 
m_ij^R(t) contains two terms. The first is the probability that a non-IDU has a 
contact with an IDU. Since IDUs have (1 - G) Σ_{i=1,...,6} X_i(t) P_i^R total contacts 
with non-IDUs, non-IDUs must have the same total number of contacts with 
IDUs in order for the total number of sexual contacts to balance. Thus, the 
first term is the proportion of total contacts by non-IDUs that are with IDUs, 
and is interpreted as the probability that a non-IDU has a contact with an 
IDU. The second term is the probability that the contact is with a member of 
compartment j given that the contact is with an IDU. Following from the 
above discussion, the sufficient contact rates for sexual mixing at time t are 
thus given by 

\[ \beta_{ij}^{R}(t) = P_i^{R} \, m_{ij}^{R}(t) \, \tau_{ij}^{R}, \quad R = H, L, \quad i = 1,\ldots,9, \quad j = 1,\ldots,9 \qquad (4) \]

The number of new infections among members of compartment i, given by 
(2), is found by multiplying the number of individuals in compartment i by 
their rate of sufficient contacts with members of compartment j for each type 
of contact. 
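
The pieces above can be combined in a short Python sketch. It assumes the reading used here: partners are split by condom use, each sexual contact rate is the product of partners, mixing probability, and transmission probability, and new infections equal the compartment size times the summed contact rates. The mixing probabilities are passed in as given values because the sketch does not attempt to reproduce the mixing matrix in (3); all numbers are hypothetical.

```python
def partner_split(p_s, d):
    """Split annual new partners P^S into condom-protected (L) and unprotected (H)."""
    return d * p_s, (1.0 - d) * p_s  # (P^L, P^H)


def sexual_contact_rate(partners_r, mixing_ij, tau_ij):
    """Contact rate for one risk type: partners x mixing probability x transmission."""
    return partners_r * mixing_ij * tau_ij


def new_infections(x_i, gamma_row, beta_h_row, beta_l_row):
    """Compartment size times the contact rates summed over j and over contact types."""
    return x_i * sum(g + h + l for g, h, l in zip(gamma_row, beta_h_row, beta_l_row))


# Hypothetical inputs for one uninfected compartment facing two infected compartments.
p_l, p_h = partner_split(p_s=2.0, d=0.25)
tau_h = 0.02            # transmission probability without a condom (assumed)
tau_l = 0.1 * tau_h     # with a condom, using ce = 0.1 (assumed)
mixing = [0.3, 0.1]     # assumed mixing probabilities m_ij

beta_h = [sexual_contact_rate(p_h, m, tau_h) for m in mixing]
beta_l = [sexual_contact_rate(p_l, m, tau_l) for m in mixing]
gamma = [0.01, 0.0]     # assumed injection contact rates

print(f"new infections per year: {new_infections(1000.0, gamma, beta_h, beta_l):.2f}")
```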

Scenarios representing regions with HIV prevalence among IDUs of 5%, 
10%, 20%, and 40% were simulated. Total costs, QALYs, and new 
infections, as well as the incremental cost-effectiveness ratio (ICER), were 
calculated for a 10-year time horizon. An expansion of MMT by 10% of 
current capacity (i.e., increasing the proportion of IDUs enrolled in MMT 
from 15% to 16.5%) was analyzed in each scenario. In the 5% scenario, the 
expansion of methadone capacity would result in a cost-effectiveness ratio of 
$10,900 per QALY gained. In the 40% scenario, the expansion of 
methadone capacity would result in a cost-effectiveness ratio of $8,200 per 
QALY gained. These cost-effectiveness ratios compare favorably to a 
number of HIV prevention and treatment programs [48, 53]. In the 5% 
scenario, approximately 36% of HIV infections averted and 71% of QALYs 
gained accrued to non-IDUs. In the 40% scenario, approximately 28% of 
infections averted and 58% of QALYs gained accrued to non-IDUs. Thus, 
substantial health benefits of MMT programs accrue to non-IDUs. 

A number of sensitivity analyses were performed to consider the cost 
effectiveness of increased methadone capacity if the newly created slots were 
less effective and/or more costly than the existing slots. New MMT slots 
may be less effective than existing slots if new recruits are less motivated to 
change their behavior than those already in MMT. New MMT slots may be 
more expensive than existing slots if there are additional costs associated 
with outreach to fill the new slots. If all new slots were half as effective and 
twice as costly as existing slots, expanded MMT capacity had a cost- 
effectiveness ratio of $36,100 in the 5% scenario and $38,300 in the 40% 
scenario. 

This study found MMT to be cost effective based on commonly accepted 
standards, under a wide range of assumptions. An important conclusion was 
that MMT could be cost effective even if it did not lead to a complete 
cessation of risky injections. Some factors, such as crime and changes in 
employment among IDUs, were omitted from the analysis. Inclusion of 
these factors would likely lead to more favorable conclusions regarding the 
cost effectiveness of methadone. 

13.2.3 Are alternative forms of maintenance treatment cost effective? 

Buprenorphine is a potential alternative to methadone and may be useful for 
expanding treatment capacity. Buprenorphine is subject to a different 
regulatory environment than methadone. Methadone is listed as a Schedule 
II drug by the U.S. Drug Enforcement Administration, while buprenorphine 
is a Schedule V drug (having low potential for abuse and accepted medical 
uses) [19]. Compared to methadone, buprenorphine is safer in overdose, has 
lower abuse potential, and produces fewer withdrawal symptoms when discontinued 
[54]. To curtail abuse through injection, buprenorphine can be taken orally 
and mixed with naltrexone or naloxone, both of which have unpleasant 
effects if injected but are relatively harmless when taken orally [54, 55]. The 
ability to mix buprenorphine with naloxone makes buprenorphine potentially 
attractive in the development of take-home or prescription maintenance 
formulations. 

Barnett et al. [56] modified the model of Zaric et al. [48, 49] to evaluate the 
cost effectiveness of buprenorphine maintenance treatment. The model of 
MMT cost effectiveness was modified to account for observed differences in 
the effectiveness of methadone versus buprenorphine as well as likely cost 
differences between the two products. 

A meta-analysis of trials comparing buprenorphine to methadone found that 
patients maintained on buprenorphine had 8.3% more positive urinalyses and 
a 26% higher dropout rate than patients in methadone [57]. One analysis of 
the economic impact of a potential take-home formulation of buprenorphine 
with naloxone concluded that the buprenorphine formula would cost between 
81% and 113% as much as methadone when patient travel time was not 
included, and 44% to 76% as much as methadone when patient travel time 
was included [58]. Barnett et al. estimated that a take-home formulation of 
buprenorphine and naloxone would cost between $5 and $30 per day, 
corresponding to annual costs of $5,733 to $14,858 [56]. 

Buprenorphine may be preferred to methadone by some IDUs. Thus, some 
newly created buprenorphine treatment slots may be filled by patients 
formerly in MMT. Let f_M and f_B be the efficacy of methadone and 
buprenorphine slots in reducing risky behavior and let f_AVE be the average 
efficacy of all treatment slots. Let N_M be the initial number of methadone 
slots, and let N_B be the number of newly created buprenorphine slots. Let p 
be the proportion of new slots filled by individuals who switch from MMT. 
Adding N_B buprenorphine slots results in a net expansion of capacity of 
(1 - p)N_B slots. The average efficacy of all slots is defined as: 

\[ f_{AVE} = \frac{f_B N_B + f_M (N_M - p N_B)}{N_B + N_M - p N_B} \qquad (5) \]

The reduction in sharing, change in mortality rates, and dropout rates for the 
treatment compartments, consisting of both MMT and buprenorphine 
patients, reflected weighted averages given by (5). 

For the case where there is no switching (p = 0), additional buprenorphine 
slots equal to 10% of current MMT capacity would have an incremental cost- 
effectiveness ratio of $14,000 per QALY gained if buprenorphine cost $5 per 
dose, ranging up to $44,200 per QALY gained if buprenorphine cost $30 per 
dose. If half of all new slots are occupied by individuals switching from 
MMT (p = 1/2), then buprenorphine costs $17,700 per QALY gained at $5 
per dose, and $84,700 per QALY gained at $30 per dose. Extensive 
sensitivity analysis was done on quality of life and the benefits of treatment 
on quality of life. 
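
The weighted average in (5), evaluated for the two switching cases just discussed, can be sketched as below. The formula is taken as the slot-weighted mean of methadone and buprenorphine efficacy, with N_M - pN_B patients remaining on methadone; the efficacy values and slot counts are hypothetical and are not the inputs used in [56].

```python
def average_efficacy(f_m, f_b, n_m, n_b, p):
    """Average efficacy of all treatment slots after adding n_b buprenorphine slots.

    p is the proportion of new slots filled by patients switching from methadone.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    methadone_patients = n_m - p * n_b        # switchers vacate methadone slots
    total_patients = methadone_patients + n_b  # equals n_m + (1 - p) * n_b
    return (f_m * methadone_patients + f_b * n_b) / total_patients


# Hypothetical values: 1,000 methadone slots, a 10% expansion with buprenorphine.
F_M, F_B = 0.8, 0.7
N_M, N_B = 1000, 100

for p in (0.0, 0.5):
    net_new = (1 - p) * N_B
    print(f"p = {p}: net new slots = {net_new:.0f}, "
          f"average efficacy = {average_efficacy(F_M, F_B, N_M, N_B, p):.4f}")
```

With switching, fewer net slots are created while the share of patients on the less effective drug rises, which is consistent with the higher cost per QALY reported for p = 1/2.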

In all cases buprenorphine was found to be less cost effective than 
methadone. However, buprenorphine is still cost effective compared to a 
number of other medical interventions. Additionally, buprenorphine has 
fewer regulatory impediments and may represent an option for expansion of 
drug treatment programs where methadone is not an option. 

A number of issues have been raised regarding the study by Barnett et al. 
[59-61]. Reductions in crime are often cited as the major benefit of drug 
treatment programs, but crime was not considered in the model. The analysis 
was done from the perspective of a health payer who may not be concerned 
with reductions in crime. However, government policy makers may be 
concerned about such costs [60]. The use of QALYs as an outcome measure 
has also been questioned for an intervention that is not seen exclusively as a 
health care program and that has substantial non-health benefits [59, 61]. 

Wall and Pollack [62] adapted the model of Zaric et al. [48, 49] to evaluate 
several drug treatment expansion strategies involving buprenorphine. They 
assumed that the effectiveness of existing and new treatment slots was a 
function of the size of the daily dose of methadone or buprenorphine, 
consistent with evidence that dose size and treatment efficacy may be related 
[63]. They considered a number of strategies involving increasing the 
methadone dosage of existing slots, converting existing slots to 
buprenorphine, and expansion with methadone and buprenorphine at varying 
dosage levels. 

Increasing methadone dosage for existing slots was found to be cost saving, 
while switching all existing slots to buprenorphine was a dominated strategy 
(i.e., more expensive and less effective than another strategy). Expanding 
capacity with methadone was found to be very cost effective. The 
methadone-only strategies all had cost-effectiveness ratios of less than 
$4,000 per QALY gained. Expanding with a mix of methadone and 
buprenorphine was found to have a cost-effectiveness ratio of less than 
$30,000 per QALY gained, and expanding capacity with buprenorphine only 
was a dominated strategy. 

Additional studies of buprenorphine are warranted given its recent approval 
for use in the U.S. [64]. A number of alternatives to methadone and 
buprenorphine treatment exist and have not been subject to the kind of 
modeling described in this chapter. For instance, L-alpha-acetylmethadol 
(LAAM) may be used to control opiate addiction. Detoxification programs 
combined with intensive social services may also be valuable. Rapid 
detoxification (so-called “rapid detox”) may represent an alternative to 
MMT [65-67]. Different treatment modalities have emerged, 
including prescription or take-home formulations of methadone and 
buprenorphine. Prescription buprenorphine is available in France [68] and 
prescription methadone is available in the United Kingdom [69]. 
Prescription heroin has also been proposed by some [70]. All of these 
options merit consideration in future investigations. 

13.2.4 If opioid maintenance treatment programs are expanded, how many 
new slots are needed? 

Lengthy waiting lists for drug treatment have been documented in many 
places [71]. Some jurisdictions require new entrants to MMT to be HIV 
infected or to have tried another drug treatment program (e.g., detoxification) 
unsuccessfully. An important question is how many treatment slots would be 
needed to eliminate or reduce treatment queues. Another important issue is 
the impact of extra capacity on waiting list performance measures. 

Ideas from queuing theory have been used to predict capacity requirements 
for treatment programs such that individuals can receive “treatment on 
demand” [72]. “Treatment on demand” was defined as having no wait for 
treatment once treatment was requested. In the model, N customers arrive 
seeking treatment; a proportion R1 remain on the waiting list and enter 
treatment, while the remaining proportion 1 - R1 join the waiting list but leave 
before entering treatment. 

Kaplan and Johri investigated the impact on drug treatment waiting lists of 
providing additional drug treatment capacity [73]. The model was not 
intended to represent any particular drug treatment program but rather to 
represent a general drug treatment model. (Experience with treatment on 
demand in San Francisco has been described elsewhere [74].) Kaplan and 
Johri examined operational outcomes including queue lengths, waiting times 
to enter treatment, and service levels. The service level was defined as the 
proportion of those requesting treatment who remained on the waiting list 
long enough to be admitted into treatment. Numerous factors contribute to 
the inability of drug users to remain on waiting lists until a treatment slot 
becomes available, such as a loss of interest in treatment, arrest, or inability 
of the treatment facility to place the person at the front of the queue. 







Kaplan and Johri [73] developed a model in which drug users can be in one 
of four states at any time: abstinent (not in treatment and not using drugs); 
not in treatment and using drugs but not waiting for treatment; waiting for 
treatment and using drugs; and in treatment and not using drugs. Let a(t) be 
the number of drug users who are abstinent at time t, and let q(t) be the 
number who are waiting for treatment at time t. They modeled a closed 
population (i.e., no new entrants and no departures) of size n, with a constant 
number of treatment slots, s. The model is depicted in Figure 13.3. 



Figure 13.3 Treatment-on-demand model 




Let δ be the tolerance to wait for treatment; this is the rate at which 
individuals waiting for treatment leave the waiting list. Let μ be the rate at 
which treatment is completed. Let ν be the rate at which abstinent users resume 
drug use. Let α be the rate at which those not in treatment request treatment. 
Let p be the probability of success per treatment episode. The population 
was modeled using the following system of differential equations: 

\[ \frac{da(t)}{dt} = s \mu p - \nu a(t) \qquad (6) \]

\[ \frac{dq(t)}{dt} = \alpha \left( n - s - q(t) - a(t) \right) - \delta q(t) - \mu s \qquad (7) \]

The first equation says that the rate of change of the size of the abstinent 
population is equal to the number of successful completions of drug 
treatment minus the number of abstinent users who resume active drug use. 
The second equation says that the rate of change of the queue length is equal 
to the number of users who request treatment minus the number who drop 
out of the queue minus the number on the queue who enter treatment. 

Closed-form solutions for a(t) and q(t) can be obtained and used to estimate 
the service level. The number of new slots needed to eliminate queues in the 
long run is given by: 







This value is not, in general, equal to the number of users currently in the 
queue. Thus, the naive approach of adding as many treatment slots as there are 
patients currently in the queue would not, in general, be the correct way to 
eliminate the treatment queue. 
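
The two differential equations of the treatment-on-demand model can also be integrated numerically. The sketch below uses a simple forward-Euler scheme and reads the system as da/dt = s*mu*p - nu*a and dq/dt = alpha*(n - s - q - a) - delta*q - mu*s, following the verbal description of equations (6) and (7). All parameter values are hypothetical and are not estimates from [73]; the equations as written do not prevent q from becoming negative, so the sketch is only meaningful while a queue exists.

```python
def simulate_queue(n, s, a0, q0, alpha, delta, mu, nu, p, years, dt=0.001):
    """Forward-Euler integration of equations (6) and (7).

    Returns the abstinent population a and the queue length q after `years`.
    """
    a, q = float(a0), float(q0)
    for _ in range(int(round(years / dt))):
        da = s * mu * p - nu * a                            # equation (6)
        dq = alpha * (n - s - q - a) - delta * q - mu * s   # equation (7)
        a += da * dt
        q += dq * dt
    return a, q


def steady_state(n, s, alpha, delta, mu, nu, p):
    """Equilibrium of (6) and (7), found by setting both derivatives to zero."""
    a_star = s * mu * p / nu
    q_star = (alpha * (n - s - a_star) - mu * s) / (alpha + delta)
    return a_star, q_star


# Hypothetical parameters (rates per year).
params = dict(n=1000, s=100, alpha=1.0, delta=1.0, mu=2.0, nu=0.25, p=0.5)
a_end, q_end = simulate_queue(a0=300, q0=50, years=50, **params)
a_star, q_star = steady_state(**params)
print(f"simulated:    a = {a_end:.1f}, q = {q_end:.1f}")
print(f"steady state: a* = {a_star:.1f}, q* = {q_star:.1f}")
```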

The model was illustrated with data from San Francisco. There were n = 
45,000 drug users in San Francisco, s = 6,300 treatment slots, q(t) = 1,400 
currently waiting to enter treatment, and a(t) = 17,600 abstinent drug users. 
Four values for the tolerance to wait for treatment (δ) among drug users were 
considered. 

The San Francisco data showed that small increases in waiting times could 
lead to large reductions in service levels. Although the number of slots given 
by (8) does not depend explicitly on δ, numerical analysis using the San 
Francisco data showed that the number 
of slots needed to eliminate queues for treatment in the long run was highly 
dependent on the tolerance to wait for treatment. This is due to the 
relationship between α and δ in equilibrium and the methods used to estimate 
α. If the tolerance to wait is one year, then 6,710 treatment slots (less than 
6,300 + 1,400) would be sufficient to eliminate treatment queues. If the 
tolerance to wait is only one day, then 11,500 slots would be required to 
eliminate treatment queues. However, it could take 22 years or more to 
eliminate the queues using these long-run estimates. For very short tolerance 
to wait, the number of slots needed to immediately eliminate queues could be 
as high as 18,000. 
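
The dependence of the long-run requirement on the tolerance to wait can be illustrated with a short calculation. Setting both derivatives in (6) and (7) to zero with q = 0 gives a = s*mu*p/nu and alpha*(n - s - a) = mu*s, hence s* = alpha*n / (alpha*(1 + mu*p/nu) + mu). This is derived here from the two differential equations and may differ in form from the expression in [73]. The sketch assumes that the observed San Francisco state is an equilibrium, so that mu*p/nu = a/s and alpha can be backed out from (7), and it assumes a treatment completion rate of mu = 3 per year, a value that is not stated in this chapter.

```python
def estimate_alpha(n, s, q, a, delta, mu):
    """Request rate alpha implied by dq/dt = 0 in (7) at the observed state."""
    return (delta * q + mu * s) / (n - s - q - a)


def long_run_slots(n, alpha, mu, success_ratio):
    """Total slots s* giving a zero equilibrium queue.

    success_ratio stands for mu * p / nu, i.e. a / s at equilibrium from (6).
    """
    return alpha * n / (alpha * (1.0 + success_ratio) + mu)


# San Francisco data from the text.
N_USERS, SLOTS, QUEUE, ABSTINENT = 45_000, 6_300, 1_400, 17_600
MU = 3.0  # assumed completion rate per year (hypothetical)

ratio = ABSTINENT / SLOTS  # mu * p / nu implied by equilibrium in (6)

for label, delta in (("one year", 1.0), ("one day", 365.0)):
    alpha = estimate_alpha(N_USERS, SLOTS, QUEUE, ABSTINENT, delta, MU)
    slots = long_run_slots(N_USERS, alpha, MU, ratio)
    print(f"tolerance {label}: alpha = {alpha:.2f} per year, long-run slots = {slots:,.0f}")
```

Under these assumptions the sketch returns roughly 6,700 slots for a one-year tolerance and roughly 11,500 for a one-day tolerance, close to the figures quoted above. The formula for s* contains no δ; the sensitivity arises entirely through the estimate of α.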

13.3 METHODOLOGICAL ISSUES AND FUTURE WORK 

Quality-of-life estimates are available for injection drug use and for HIV, but 
currently there is no specific estimate for “IDUs with HIV”. Thus, the 
separate quality-of-life estimates must be combined in some way. It is not 
clear if the aggregate quality-of-life estimate for a compartment representing 
many quality-of-life decrements should be derived through a multiplicative 
model (as in [48, 49, 56]), an additive model (as in [39, 59]), or neither. 
Issues related to combining QALY estimates have been discussed elsewhere 
[75]. 







Several researchers have noted that IDUs do not mix randomly, but rather 
their injection contact patterns form structured social networks. Studies have 
revealed the structure of IDU social networks in Colorado Springs [76, 77] 
and New York City [78]. The compartmental epidemic models described in 
the previous section assume random mixing, in which each person selects a 
new partner randomly from the entire population. While random mixing in 
compartmental models leads to “worst case” epidemics - that is, epidemics 
with the greatest possible spread of disease among all possible mixing 
patterns [79-81] - it is unclear whether the random mixing assumption 
overestimates or underestimates the incremental impact of drug maintenance 
programs. 

Network epidemic models have been used as an alternative to compartmental 
epidemic models and may be useful if connectivity or network structure is 
important. However, network models are often significantly more complex 
than compartmental models. The threshold conditions for an endemic 
epidemic may be very different for a network model than a compartmental 
model [82]. Watts looked at epidemic spread in static connected networks 
and found that network structure had little impact on eventual epidemic 
outcomes [83]. Zaric directly compared random versus non-random mixing 
in network epidemic models and found that random mixing led to small 
increases in the number of new infections [84]. However, the observed 
difference may be smaller than the range in uncertainty in the parameters of 
the statistical distributions. To our knowledge, no research has yet directly 
addressed the question of whether an intervention would appear more or less 
cost effective when evaluated using a network model with nonrandom 
mixing versus a model with random mixing. 

A compartmental model forces all individuals into a finite number of discrete 
compartments, with members of each compartment assumed to be 
homogeneous. In some cases there may be large variations in characteristics 
of members of various groups. Estimates of injection frequency vary from 1 to 
3 injections per month [85] to more than 100 per month [86]; estimates of the 
number of new sexual partners also vary over a wide range [87]. Ignoring 
population heterogeneity by using average or representative values may lead 
to systematically biased estimates of outcomes when Markov models are 
used to generate cost-effectiveness ratios [88, 89]. Similar biases may exist 
in compartmental models. 

Pollack noted that the choice of time horizon may be important when 
compartmental epidemic models are used to evaluate the costs and benefits 
of medical interventions [90]. For modest interventions (defined as those for 
which the reduction in the sufficient contact rate is very small) short-term 
incidence analysis would underestimate long-term effectiveness when the 
equilibrium prevalence is below 50%, and overstate the long-term benefits 
when the equilibrium prevalence is above 50%. These findings may have 
implications for the choice of time horizon for evaluating programs directed 
to IDUs, where prevalence of HIV and hepatitis C may be very high. 

Numerous studies have shown that IDUs who inject cocaine or speedballs 
(cocaine and heroin mixed together) inject far more often than those who 
primarily inject heroin. MMT provides relief from opioid dependence but 
may not have the same impact on cocaine injectors. Some have argued that 
methadone use may actually lead to an increase in cocaine use [91], or that 
cocaine users should not be allowed to enter MMT [92]. Future empirical 
research could look at the impact that MMT has on cocaine injection 
frequency. Future modeling efforts may involve construction of a model 
with separate compartments for IDUs who primarily inject heroin and IDUs 
who inject cocaine. 

13.4 CONCLUSIONS 

Much of the debate around drug treatment is concerned with political and 
philosophical issues such as whether MMT is a “moral” way to treat opioid 
dependence. These considerations cannot be ignored in policy formulation 
[61]. Operations research models cannot address such issues. However, OR 
models can be used to identify good policies and to distinguish good policies 
from poor ones. They can also provide methods to facilitate cost- 
effectiveness analysis and to examine the health and economic tradeoffs 
associated with drug abuse treatment programs. Analysis of drug abuse 
treatment programs represents a valuable research area for operations 
researchers in the future, one where OR models can provide valuable input to 
important public policy questions. 

SUMMARY 

Operations research has contributed to the control of blood-borne epidemics 
among injection drug users. The analysis of random-mixing models has led 
to a deeper understanding of both syringe exchange programs and substance 
abuse treatment in the control of HIV/AIDS and hepatitis. This chapter 
presents some of these results, and analyzes illustrative models to show how 
simplified but empirically pertinent mathematical models can help 
policymakers evaluate public health interventions. 

14.1 INTRODUCTION 

Public policymakers have tried many approaches to address the social, 
economic, and medical problems associated with substance abuse. For many 
participants in the drug policy debate, “harm reduction” provides the 
touchstone in evaluating the success of these efforts, and a useful alternative 
to simple “use reduction” as a guide for public policy [1]. Harm reduction 
admits diverse meanings among policymakers, clinicians, and academic 
researchers in the drug policy debate. Yet it commands broad support among 
those who seek to balance the competing harms caused by both drug use and 
by public policies to hinder, deter, or punish drug use [2]. 

Harm reduction has proved especially important in evaluating clinical and 
policy responses to injection drug use. Injection drug users (IDUs) have long 
experienced high rates of avoidable mortality and morbidity from infectious 
disease [3]. The most deadly threat now arises from HIV/AIDS. Yet less 
visible infectious diseases, especially hepatitis B and C, endocarditis, and 
tuberculosis also threaten the health and survival of IDUs. Heroin overdose 
provides an additional source of premature mortality and morbidity among 
IDUs [4]. 

Some problems associated with drug use are intimately connected with the 
intensity and the duration of drug consumption among IDUs. For example, 
interventions to reduce property crime by IDUs may fail if they do not 
reduce individual expenditures on illicit drugs. Yet such use reduction is 
sometimes impossible or unnecessary to achieve the desired policy goal [3]. 
Many OECD (Organisation for Economic Cooperation and Development) 
countries have successfully reduced HIV incidence and prevalence among 
IDUs - even within populations that continue to regularly inject heroin or 
other illegal drugs [5]. 

Harm reduction provides the guiding question, though not a clear algorithm, 
to address these concerns. From this perspective, policy analysts, clinicians, 
and policymakers seek to clarify the goals of public policy, and to scrutinize 
the ability of specific policies to advance the well-being of the general 
community and of drug users themselves. 

14.2 CLINICAL AND POLICY RESPONSES 

This chapter focuses on two kinds of interventions often described under the 
rubric of harm reduction: substance abuse treatment (specifically, methadone 
maintenance treatment) and syringe exchange programs. However, to place 
these interventions in context one must consider their place in broader public 
policy. 
Three kinds of public policy interventions seek to address such threats to life 
and limb among IDUs: supply-side or demand-side law enforcement, harm 
reduction interventions such as syringe exchange for active drug users, and 
substance abuse treatment. Operations researchers have performed important 
policy analysis of all three kinds of intervention. The impact and cost- 
effectiveness of law enforcement efforts are outside the scope of this 
chapter. However, because the term “harm reduction” is sometimes applied 
to analyze law enforcement policies, we briefly discuss this research. 

Supply- and demand-side law enforcement The most traditional drug 
policy interventions are supply-side law enforcement efforts to deter or 
punish individuals who sell or distribute heroin or other injectable drugs. 
Source-country enforcement activities and border interdiction efforts seek to 
disrupt the organizations and firms involved in drug trafficking. Supply-side 
enforcement also encompasses the arrest of street-level drug users, a subject 
of great complexity given the vagaries of low-wage labor markets for 
potential drug sellers and the high prevalence of substance use and 
dependence among street-level dealers [6, 7]. 

Such efforts seek to contract drug supply, thereby raising market prices of 
illicit substances. In economic terms, these law enforcement policies reduce 
the quantity of drugs supplied at any specific market price - a shift in the 
supply curve - raising market prices and reducing drug consumption in the 
resulting market equilibrium [8]. 

The short-term and long-term effectiveness of interdiction and source- 
country policies is influenced by the elasticity of supply for illicit 
substances. If drug supply is price-elastic, supply-side law enforcement 
is likely to have a small effect on both prices and consumption. Effective 
interdiction and source-country policies will simply induce new entrants to 
the market. 

The effectiveness of supply-side enforcement is also influenced by the 
responsiveness of IDUs to changes in market prices. If the price-elasticity of 
demand for heroin is less than -1.0, enforcement-linked price increases will 
induce accompanying reductions in both drug consumption and in overall 
expenditures by IDUs. If the quantity consumed is insensitive to market 
prices, price increases will induce only a small decline in heroin 
consumption and will increase overall expenditures for heroin among 
IDUs [9]. 
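
The expenditure argument above can be checked with a small numerical sketch. It assumes a constant-elasticity demand curve, Q = Q0 (P/P0)^e, which is an illustrative functional form and not one taken from the studies cited. The 10 percent price increase and the three elasticity values are hypothetical, and the function names are invented for this example.

```python
# Illustrative sketch only: constant-elasticity demand is an assumed functional
# form, and the price increase and elasticities below are hypothetical values.

def quantity_demanded(price, elasticity, base_price=1.0, base_quantity=1.0):
    """Constant-elasticity demand curve: Q = Q0 * (P / P0) ** elasticity."""
    return base_quantity * (price / base_price) ** elasticity


def response_to_price_increase(price_increase, elasticity):
    """Return fractional changes in (consumption, total expenditure)."""
    new_price = 1.0 + price_increase              # baseline price normalized to 1
    new_quantity = quantity_demanded(new_price, elasticity)
    change_in_quantity = new_quantity - 1.0       # baseline quantity normalized to 1
    change_in_spending = new_price * new_quantity - 1.0
    return change_in_quantity, change_in_spending


if __name__ == "__main__":
    for elasticity in (-1.5, -1.0, -0.3):
        dq, de = response_to_price_increase(0.10, elasticity)
        print(f"elasticity {elasticity:+.1f}: "
              f"consumption {dq:+.1%}, spending {de:+.1%}")
```

Under these assumptions, an elasticity below -1.0 lowers both consumption and spending, an elasticity of exactly -1.0 leaves spending unchanged, and an elasticity between -1.0 and 0 lowers consumption only modestly while spending rises.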
Demand-side enforcement measures include penalties for drug possession 
and purchase. Such policies raise the (non-price) costs of illicit substance 
use, and may thereby reduce demand for these substances [10]. Such policies 
are attractive if they deter substance use, and attractive as a mechanism to 
reduce the profits associated with illicit drug sales. Moreover, many IDUs 
commit larceny and other property crimes. Arresting IDUs for simple drug 
possession may therefore be a low-cost means of incapacitating (non-drug) 
criminal offenders [11]. 

An important drawback of enforcement policies is that they impose large 
costs on both individual IDUs and on the wider society. IDUs bear the short- 
term and often lifelong consequences of incarceration or other judicial 
interventions. Taxpayers must finance law enforcement and correctional 
interventions. Moreover, specific law enforcement strategies such as 
aggressive enforcement of illicit drug paraphernalia laws may encourage 
needle-sharing, unsterile discard of used syringes, and other high-risk 
behaviors [12]. 

If substance abuse treatment or prevention interventions can halt, reduce, or 
prevent injection drug use, less punitive alternatives may be preferable to 
criminal justice interventions. Studies by Caulkins and colleagues examine 
the cost-effectiveness (cost per unit of reduced drug consumption) of a wide 
range of prevention, treatment, and criminal justice system interventions [11, 
13]. Such studies provide strong support for the cost-effectiveness of 
prevention and treatment interventions. 

The price-elasticity of demand for illicit drugs is especially important from 
the perspective of crime control, since many drug users finance their 
consumption through property crime or other illegal activities [14]. 
Operations researchers play an important role in this debate through the 
detailed analysis of illicit drug markets and the relationship between 
interdiction efforts and resulting drug prices [15]. Operations researchers, 
including Caulkins and colleagues, have explored these issues in some 
depth. Using data from the Drug Enforcement Administration’s System to 
Retrieve Information from Drug Evidence (STRIDE) database, these 
researchers have explored regional variations and changing market 
conditions for marijuana, cocaine, heroin, and other illicit drugs [15, 16]. 

Three findings from this literature are noteworthy. 

One striking finding speaks to the difficulties of supply-side enforcement. 
Purity-adjusted illicit drug prices have declined despite substantial supply- 
side interdiction and enforcement efforts [2, 10]. Declining prices and 
increasing purity of street heroin have posed complex challenges for 
substance abuse treatment, and have fostered non-injecting forms of heroin 
use such as snorting [17]. 

A second finding is that drug users respond to the price of illicit drugs, 
particularly in the long run [9]. This result - predicted by “rational 
addiction” dynamic optimization models of drug consumption - suggests 
that the long-term effect of price decreases is to significantly increase the 
number of illicit drug users. Optimal control theory and other operations 
research methods have been applied, profitably, by health economists and 
others seeking to understand illicit drug markets [18]. 

A third and related insight speaks to the dynamic character of drug markets 
[19]. Patterns of illicit drug consumption changed rather rapidly over the 
1975-2000 period, with current prevalence responsive to past consumption, 
positive and negative “role modeling” by current and past drug users, and 
other feedback effects. Forecasting models have been developed to explore 
these effects. 

Of particular importance are the transitions in drug-using behavior among 
IDUs. The propensity of light or casual users to become heavy users, and 
quit rates among different categories of users powerfully influence the 
number of future IDUs, and influence the likely social harms associated with 
different forms of substance use [20]. 

The remainder of this chapter focuses on the two remaining kinds of harm 
reduction interventions, substance abuse treatment and syringe exchange 
programs. For more information on substance abuse policy, see Kleiman [7], 
MacCoun and Reuter [2], and the collection of essays edited by Heymann 
and Brownsberger [21]. 

Substance abuse treatment includes a broad array of inpatient and 
outpatient medical, psychiatric, and social service interventions designed to 
halt or reduce illicit drug use [4, 22]. This chapter focuses on methadone 
maintenance treatment (MMT), because this is the principal modality used to 
treat injection drug use. Massing [23] describes the history and development 
of MMT. Although many challenges exist to the effectiveness of MMT, 
ranging from inadequate dosing to the difficulties of treating poly-drug use, 
the impact and cost-effectiveness of MMT is well established [24, 25]. 

The value of such substance abuse treatment has been underscored by 
randomized trials of MMT. In one study of Swedish IDUs, 2 of 17 members 
of the non-MMT control group died from apparent overdose. One other 
member of the control group suffered a leg amputation, while two others 
suffered severe infection. Among the remaining controls, two were 
incarcerated, and 9 of the remaining 10 continued illicit drug use. Over the 
same period, none of the MMT group suffered major health problems, and 
13 of the original 17 were no longer using illicit drugs [26]. Three more 
members of the control group died over the following three years, in a study 
completed before the era of HIV/AIDS [4]. 

Syringe exchange programs (SEPs) are a more pure form of “harm 
reduction.” Although the design and operation of SEPs differ, the common 
aim is to prevent infectious disease transmission among active IDUs through 
the provision of sterile injection equipment and through the safe collection 
of discarded syringes. To focus on the harm reduction dimension of SEPs, 
we assume in this chapter that such interventions have no other impact on 
the frequency of drug use among IDUs, and that SEP has no impact on the 
removal of program clients from the population of active IDUs. Because this 
chapter does not consider the role of SEP as a conduit into MMT or other 
treatment and social services, this is an important oversimplification [27]. A 
fuller treatment would likely indicate greater impact and cost-effectiveness 
of SEPs. 

14.3 THE CONTRIBUTION OF OPERATIONS RESEARCH TO 
POLICY 

Many clinicians and policy makers are skeptical about the merits of analytic 
modeling to scrutinize drug control policies, especially the special problems 
of IDUs. 1 Much of this skepticism arises because of real limitations of the 
data and models available to study this population. IDUs are a hidden 
population whose risk behavior, and even whose absolute numbers, are 
imperfectly known [29]. Basic parameters must be indirectly inferred from 
fragmentary data. Nationally representative surveys provide poor coverage 
of high-risk populations, including IDUs. Data from clinical services such as 
hospital emergency departments or drug treatment programs are based upon 
a self-selected group of patients and may not apply to out-of-treatment IDUs 
[30]. 

The probability of infection with HIV or hepatitis when a susceptible IDU 
uses an infected needle is imperfectly known. Several analyses seek to 
estimate this parameter based upon needle-stick accident data among 
hospital personnel. Other analyses indirectly estimate these probabilities 
based upon observed patterns of disease spread [31]. Neither method is fully 
satisfactory in characterizing risk exposures among IDUs. 



1 This section modifies the discussion in Pollack [28]. 
Observational studies suggest that methadone maintenance treatment 
(MMT) reduces the rate of new HIV infections (HIV incidence) among 
IDUs. MMT clients are less likely than out-of-treatment IDUs to share 
needles, inject drugs less frequently, and are less likely to practice other 
behavioral risks [32]. Several studies document large differences in HIV 
incidence between steady methadone clients and out-of-treatment IDUs [33- 
35]. 

The impact of MMT on hepatitis C (HCV) transmission is less encouraging. 
Like HIV, HCV is spread through sharing of infected injection equipment, 
including syringes, “cookers,” and filters [36, 37]. However, studies of both 
IDUs and health care workers exposed to needle-stick injuries indicate that 
HCV is more efficiently transmitted [38, 39]. 

This high infectivity poses a basic challenge to any prevention intervention 
that seeks to reduce the frequency and duration of injection drug use. From 
an analytic perspective, differences between HIV and HCV underscore the 
difficulties one encounters in evaluating interventions. Individual behavior 
changes and other impact measures may be readily observed. Yet these 
measures are difficult to link with underlying patterns of infectious disease 
spread. Analytic models become essential to make this connection, to clarify 
the value of alternative data sources and measures, and to scrutinize causal 
assumptions that undergird prevention interventions. 

Most sobering are the many IDU populations with low HIV prevalence but 
endemic prevalence of HCV. Pollack and Heimer [40] reviewed published 
literature on HCV prevalence among European IDUs. Although results vary 
across populations, most studies found prevalences between 65% and 85%. Only 
four of 40 examined studies found HCV prevalence below 50% [40]. Similar 
results are found in studies of MMT clients [41-43], including studies in the 
U.S., Australia, and many places in Western Europe, typically reporting HIV 
prevalence below 10 percent, but HCV prevalence exceeding 70 percent [44- 
46]. Results for young IDUs are somewhat more hopeful [45-48]. However, 
other studies have yielded more disappointing results [49, 50]. 

HCV prevalence comparisons between MMT clients and out-of-treatment 
IDUs have yielded mixed results. Out-of-treatment IDUs are often found to 
have lower HCV prevalences. However, this result may be confounded by 
the older age of the in-treatment population. 

Epidemiological studies and analytic models of syringe exchange programs 
(SEPs) indicate the same contrasts between HIV and HCV prevalence. Many 
studies indicate that SEPs can reduce HIV incidence. Such findings 
undergird the long-standing support by most public health researchers for 
syringe exchange and similar programs [30, 51]. 

The demonstrated impact of SEPs in preventing HCV transmission is less 
favorable. Using SEP data from Seattle, Hagan and collaborators found no 
protective effects [52]. Theoretical analysis due to Pollack found little 
impact and poor cost-effectiveness of typical SEPs in HCV prevention [53, 
54]. As discussed below, models based upon the short-term impact of 
syringe exchange may greatly overstate long-term SEP effectiveness in 
reducing incidence of highly infectious agents such as HCV. 

Sexual and needle-sharing mixing patterns among IDUs are also poorly 
understood. Models in which IDUs share needles with random partners are 
widely used because random mixing provides a tractable worst-case analysis 
[55-57]. However, social network models are likely to provide a more 
sociologically plausible pattern of infectious disease spread [58, 59]. 

Equally important, rigorous evaluations of specific interventions may not be 
generalizable across populations and settings. Substance abuse treatment and 
SEPs differ greatly in both effectiveness and cost. Such diversity calls into 
question any analysis that draws sweeping comparisons across diverse 
categories of competing interventions [30]. 

Although one must acknowledge reasons for skepticism in applying analytic 
models to policy, such efforts provide important contributions to policy 
debates. Modeling exposes for scrutiny the implicit assumptions that 
policymakers are already using in addressing injection drug use. Public 
policies are often based upon unexamined assumptions that appear 
questionable or implausible when brought to light. 

For example, some clinicians advocate the proliferation of difficult-to-reuse 
syringes to slow HIV spread. Simple but compelling epidemiological models 
indicate that, if the frequency of injection among IDUs is insensitive to the 
supply of new needles, such devices are likely to accelerate infectious 
disease spread [60]. 

As another example, Kaplan and Pollack reviewed procedures used to 
allocate HIV prevention resources [30, 61]. Many U.S. policy makers try to 
allocate resources based upon the number of individuals in each risk group. 
Such an approach is inappropriate when either program effectiveness or HIV 
incidence varies across the pertinent risk groups. 

Worse, the political and organizational realities of group decision processes 
easily foster arbitrary policies and arbitrary allocation of resources. Altman 
and colleagues note (p. 81) that health planners respond to technical and 
political uncertainty by seeking “convenient proxies for need to be applied in 
allocation decisions” [62]. Wary of debating the merits of specific facilities, 
many health system planners are drawn to elaborate need-assessment 
formulas to evaluate proposed services. 

Such methods provide poor guidance regarding the impact or cost- 
effectiveness of proposed expenditures, but find wide appeal as planners 
seek credible focal points to resolve internal disputes and to justify 
controversial policies. Such approaches are widespread in many areas of 
resource allocation [63]. Brandeau and colleagues and Kaplan discuss more 
rigorous and explicit approaches to allocating scarce resources [61, 64]. 
Explicit modeling helps to discipline group decision processes, and allows 
policymakers to explore the unintended assumptions and consequences of 
appealing but limited algorithms that are widely used to allocate resources. 

Models also help policy makers understand the linkage between the 
available data and the latent causal processes one seeks to influence through 
public intervention. Paltiel and Stinnett describe many ways that analytic 
models can interrogate the premises and likely consequences of policy 
interventions [65]. 

Analytic models clarify the links between readily-measured or readily- 
influenced intermediate outcomes and the ultimate outcomes of direct policy 
concern. Many of the best evaluations of HIV prevention interventions do 
not directly scrutinize HIV incidence among program participants. Rather, 
such evaluations explore the impact of such interventions on important 
behavioral risks [66-68]. Analytic models help establish the linkage between 
these behavioral risks and actual health outcomes. For interventions such as 
syringe exchange that have not been (or cannot be) evaluated through 
prospective randomized trials, analytic models can scrutinize the findings of 
observational studies of those interventions. 

Analytic models can also identify the kinds of data required for resource 
allocation and for other public health functions. Public health reporting and 
data systems are largely designed to accomplish classic functions of 
epidemiological surveillance such as case enumeration and contact tracing. 
The quality and performance of such systems is traditionally scrutinized 
through such measures as the completeness of case finding and avoidance of 
duplication when the same case is reported multiple times or in multiple 
jurisdictions. 

Although such performance measures are pertinent to the provision of 
medical and other services to all infected individuals, they are sometimes 
misleading when surveillance data are used for other purposes. When 
allocating a fixed pool of resources across populations and jurisdictions, the 
most important characteristic of HIV or other surveillance systems is their 
ability to provide comparable and unbiased estimates of prevalence and 
incidence across populations. Researchers are beginning to apply more 
explicit scrutiny to the process of funding allocation, and are examining how 
different approaches to centralized resource allocation influence resource 
allocation across competing jurisdictions [69]. 

Techniques such as sensitivity analysis can also support the reliability and 
robustness of even highly simplified or empirically uncertain models in 
providing policy guidance. For example, research on nonrandom mixing and 
random graph theory highlights the value of random mixing models in 
characterizing infectious disease transmission within high-risk populations 
[70]. 

Operations researchers also help direct the attention of policymakers and 
analysts to critical concerns that might be overlooked in the absence of 
formal analytical models. As an example, infectious disease prevention 
measures are typically based upon disease prevalence across the population. 
Prevalence is easily measured using existing clinical data systems when 
infected individuals reliably seek medical attention. Moreover, prevalence- 
based allocation is often the best strategy to allocate treatment resources 
across competing populations and interventions. However, in a changing 
epidemic, current prevalence may provide poor guidance about the specific 
risk groups that are currently experiencing the highest rate of new infection. 
HCV incidence analysis indicates that young and inexperienced IDUs are 
experiencing high rates of new infection [71]. For HIV, incidence-based 
resource allocation is likely to channel greater resources to nonwhites and to 
residents of southeastern states [30, 72]. 

Analytic techniques can also demonstrate how interventions that are 
effective for one problem are likely to be much less effective in addressing 
related problems in a different setting. As discussed below, simple analytic 
models help researchers and policymakers to establish the success of SEPs 
in slowing HIV spread. Short-term reduction in HIV transmission is 
sufficient to reduce long-run incidence and prevalence because the HIV 
virus, though deadly, is inefficiently transmitted in each individual act of 
needle sharing between infected and uninfected persons. However, 
infectious disease transmission models indicate that similar-quality SEP 
interventions are less effective in the prevention of HCV than in the 
prevention of HIV [53, 54, 73, 74]. This has been observed in many IDU 
populations, which display endemic HCV prevalence despite well- 
implemented prevention interventions that successfully maintain low HIV 
prevalence [42, 43]. 
14.4 AN ANALYTIC MODEL OF BLOOD-BORNE DISEASE 
AMONG IDUs 

Basic insights can be illustrated using a simple but useful epidemiological 
model of infectious disease transmission among IDUs. This section presents 
an analytic framework that examines the long-term and short-term impact of 
both MMT and SEPs in reducing infectious disease spread. This model 
focuses on the cost-effectiveness of such interventions, conceptualized as the 
costs per averted HIV or HCV infection associated with the prevention 
intervention. It does not include a more complete cost-utility model. Zaric 
and colleagues have published several analyses from a cost-utility 
perspective [57, 75]. 

The model below, like others in the policy analysis literature, is based on a 
simplified depiction of injection drug use and treatment interventions. It 
does not consider heterogeneity among IDUs in the manner, frequency, and 
social context of their injection drug use, though these characteristics are 
known to vary among IDUs [36]. It does not consider differences in 
transmission risk associated with viral load and other complex 
characteristics of infected and uninfected persons. It uses a standard, 
random-mixing model of infectious disease transmission rather than a more 
sociologically nuanced network model of needle-sharing networks [76]. It 
does not consider sexual risk among IDUs. 

Each of the above simplifications is costly, because each excludes something 
important for infectious disease spread. Despite these simplifications, the 
resulting model illuminates the basic trade-offs that confront policymakers, 
and it helps to identify critical parameters that determine likely policy 
success. By allowing explicit cost-effectiveness calculations, this model 
provides a simplified, but useful yardstick to compare MMT to other 
prevention efforts. 

We use the model presented by Pollack and Heimer [40] to present the basic 
story. In particular, we consider a self-contained population of some N(t) 
active IDUs. This number might vary over time as a result of prevention 
interventions to discourage drug use. New (uninfected) IDUs enter the 
population at a constant rate of θ per day. IDUs leave the population at 
random at some constant rate of δ per person per day. This implies that the 
average duration of an active drug use “career” is (1/δ). In a particularly 
unrealistic but useful assumption, the exit rate δ is assumed to be 
independent of both disease status and one’s previous experience as a drug 
user. Averaging estimates from Kaplan’s “needles that kill” analysis and 
those reported from among Baltimore’s ALIVE cohort, we set δ = 1/(3994 
days) [53, 77, 78]. 
We collapse all injecting equipment into a single entity - syringes - and we 
posit that IDUs freely share syringes at shooting galleries and similar venues 
that promote random mixing. 2 These circumstances promote rapid disease 
spread as susceptible IDUs encounter contaminated, potentially infectious 
injection paraphernalia. Although random mixing is a worst-case 
assumption, mathematical models indicate that it provides a good 
approximation to non-random models when there are high contact rates and 
some overlap across disparate sharing networks [56, 70, 85]. 

Drug users are assumed to frequent these locations with a constant arrival 
rate of λ per unit time. The true value of λ is difficult to observe directly. 
Some research has assumed that IDUs share syringes once per week [31]. 
More recent data suggest less frequent sharing, though IDUs may under- 
report the extent of sharing [86]. Infectious disease transmission can occur 
when an uninfected person shares a syringe first used by an infected person. 
When such sharing occurs, we assume a constant probability κ that the 
virus is actually transmitted. Rather than pick a specific point estimate, we 
examine values across the empirically pertinent range, from a low value 
of κ = 0.5% to a high value of κ = 7.5%. The low value corresponds to 
published analyses of HIV transmission, while the high value is extrapolated 
from data from needle-stick accidents involving health care workers [57]. 

At any given time t, there are some I(t) infected individuals. The proportion 
of infected individuals is the ratio π(t) = I(t)/N(t). 

Table 14.1 summarizes the relevant parameters and simulation values. 

14.4.1 Baseline epidemiological model 

The basic model is most readily presented in the absence of policy 
intervention. On any given day, N(t) - I(t) = N(t)[1 - π(t)] uninfected IDUs 
remain susceptible to infection. Each IDU shares at a rate of λ per day. 
Given random mixing, the probability that an uninfected IDU shares a 
needle with an infected IDU is identical to π(t), the proportion of infected 
persons within the active population of IDUs. When a susceptible shares 
with an infected person, she has probability κ of becoming infected. 



2 IDUs also share ‘cookers’ and filters, and water sources contaminated by syringe 
mixing. IDUs also use previously-used syringes in ways that allow for further 
infection [79-84]. Cookers, filters, and water may be more important for HCV 
transmission than for HIV given the differences in infectivity between the two 
agents. 
Table 14.1 Parameters and values 

Parameter            Definition                                          Baseline value 
-------------------  --------------------------------------------------  -------------- 
N(t)                 IDU population                                      See text 
I(t)                 Number of infected                                  See text 
π(t)                 HIV or HCV prevalence                               See text 
θ                    Arrival rate into IDU population                    0.5/day 
λ                    Arrival rate into shooting galleries                1/(7 days) 
κ                    Infectivity                                         0.005-0.075 
δ                    Exit rate from active IDU population                1/(4000 days) 
M                    Number of treatment slots                           See text 
C                    Treatment cost/person/day                           $14.00 
d                    Cost of SEP/person/day                              $5.00 
u                    Reduction in injection rate during treatment        75% 
γ                    Proportional reduction in syringe sharing rate      1/3 
                     associated with syringe exchange 
[symbol illegible]   Exit rate from treatment                            1/(400 days) 

Analytic results 
-------------------  --------------------------------------------------  -------------- 
V_0                  Present discounted value of infections without      See text 
                     any MMT slots 
V_M                  Present discounted value of infections given M      See text 
                     MMT slots 
i_0                  Steady-state infectious disease incidence without   See text 
                     intervention 
N_0                  Steady-state population size without intervention   See text 
π_0                  Steady-state prevalence without intervention        See text 
N #                  Steady-state population size with MMT implemented   See text 
Δ_short-term         Short-term incidence decline attributable to SEP    See text 
Δ_long-term          Long-term incidence decline attributable to SEP     See text 
Combining these terms, infectious disease incidence - the number of new 
infections per day - is 

$$i_0(t) = \kappa \lambda \, \pi_0(t) \, [1 - \pi_0(t)] \, N_0(t) \tag{1}$$

Here the subscript 0 is used to denote variables in the absence of 
intervention. The epidemic spreads most rapidly when half of the population 
is infected. At this prevalence, the number of sharing pairs that involve one 
infected and one uninfected person is maximized. 
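
The claim that incidence peaks at 50 percent prevalence can be verified numerically. The sketch below evaluates the incidence expression, infectivity times sharing rate times prevalence times (1 - prevalence) times population, on a grid of prevalence values. The population of 2,000 and the parameter values are taken from the baseline column of Table 14.1 (arrival rate 0.5/day divided by exit rate 1/(4000 days)); the grid search and the function names are assumptions of this example.

```python
# Illustrative sketch: locate the prevalence at which incidence is largest.
# kappa = infectivity per sharing event, lam = sharing rate per day.

def incidence(prevalence, population, kappa, lam):
    """New infections per day under random mixing."""
    return kappa * lam * prevalence * (1.0 - prevalence) * population


def peak_prevalence(population, kappa, lam, steps=100):
    """Grid search over prevalence values 0, 1/steps, ..., 1."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda p: incidence(p, population, kappa, lam))


if __name__ == "__main__":
    population, kappa, lam = 2000, 0.005, 1 / 7
    p_star = peak_prevalence(population, kappa, lam)
    print(f"incidence peaks at prevalence {p_star:.2f}")
    print(f"peak incidence: {incidence(p_star, population, kappa, lam):.3f} per day")
```

Because the product of prevalence and its complement is symmetric about one half, the location of the peak does not depend on the infectivity, the sharing rate, or the population size; only the height of the peak does.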

Since some individuals exit the population of active IDUs, 

$$\frac{dI_0(t)}{dt} = i_0(t) - \delta \, I_0(t) \tag{2}$$

In steady state, I_0(t) is no longer a function of time; that is, dI_0(t)/dt = 0. We 
therefore use the subscript 0 and omit references to t to indicate a steady- 
state value. 

Thus, I_0 = i_0 (1/δ). This implies that the number of infected IDUs equals the 
rate of new infections per unit time multiplied by the mean duration of 
infected IDUs within the population. 

The same analysis indicates that the proportional steady-state prevalence, 
π_0 = I_0/N_0, is 

$$\pi_0 = 1 - \frac{\delta}{\kappa \lambda} = 1 - \frac{1}{R_0} \tag{3}$$



The quantity (8 /kX) is the reciprocal of the reproduction number Rq. Absent 
intervention, Ry is the expected number of individuals who would be 
infected by a single infected drug user introduced into an entirely susceptible 
population. Clinical or policy interventions that drive Rq below 1.0 will drive 
steady-state prevalence to zero. Such interventions might reduce the 
frequency of needle sharing (X) through health education interventions, 
increase the rate of exit (8) from the IDU population, or reduce infectivity 
(K) through the provision of bleach to clean potentially infected syringes. 
More complex models yield different values of Rq, This parameter is 
fundamental to many epidemiological policy models of infectious disease 
spread [87]. 
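As a rough numerical illustration of the steady-state relationship between the reproduction number and prevalence, the sketch below computes R_0 = κλ/δ and the implied prevalence 1 - 1/R_0. The function names are hypothetical, and the parameter values (κ=0.01, one sharing episode per week, δ=1/4000 per day) are illustrative assumptions rather than values taken from Table 14.1.

```python
def reproduction_number(kappa, lam, delta):
    """R_0 = kappa * lam / delta in the random-mixing model."""
    return kappa * lam / delta


def steady_state_prevalence(r0):
    """pi_0 = 1 - 1/R_0, floored at zero when R_0 <= 1."""
    return max(0.0, 1.0 - 1.0 / r0)


# Assumed, illustrative values: infectivity per sharing episode,
# sharing episodes per day, and exit rate per day.
kappa, lam, delta = 0.01, 1.0 / 7.0, 1.0 / 4000.0

r0 = reproduction_number(kappa, lam, delta)
pi0 = steady_state_prevalence(r0)
print(f"R_0 = {r0:.2f}, steady-state prevalence = {pi0:.3f}")
```

Under these assumed values the epidemic settles at a prevalence above 80%; an intervention that pushed R_0 below 1.0 would return a prevalence of zero from the same function.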
One must consider the size of the overall population of IDUs. Every day,
some number θ of uninfected IDUs enter the population. If there are N(t)
IDUs at time t, some N(t)δ will exit the IDU population every day. In
steady state, there will be N_0 active drug users, where the number of new
IDUs entering the population balances the number of IDUs who leave the
population. These flows balance when population size is equal to the arrival
rate θ of new individuals per unit time, multiplied by the mean length of
time (1/δ) that an individual remains an active IDU:

\[ N_0 = \frac{\theta}{\delta} \tag{4} \]

Finally, one must consider steady-state disease incidence, i_0. Since
steady-state prevalence equals incidence multiplied by duration of IDUs within the
active population, we have i_0 = δI_0 = δN_0π_0. This implies that

\[ i_0 = \theta\left(1 - \frac{1}{R_0}\right) \tag{5} \]

In cost-effectiveness analysis, the important quantity is the number of 
averted infections associated with treatment intervention. However, the 
timing of infections also matters. An averted infection five years from now is 
less valuable than an averted infection today. Given the time value of 
money, future averted infections must be discounted by precisely the same 
factor as the funds expended to finance the intervention. Given a discount 
rate r, the present discounted value of new infections is expressed 
mathematically as 

\[ V_0 = \int_0^{\infty} i_0(t)\,e^{-rt}\,dt = \int_0^{\infty} \kappa\lambda\,\pi_0(t)\,[1-\pi_0(t)]\,N_0(t)\,e^{-rt}\,dt \tag{6} \]

Here r is a discount rate appropriate for public policy intervention. 
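The discounted integral has no convenient closed form once prevalence varies over time, but it is simple to approximate. The following is a minimal sketch, assuming a forward-Euler step of one day, a population already at its steady-state size θ/δ, and illustrative parameter values that are not drawn from Table 14.1. The function name and its arguments are hypothetical.

```python
import math


def discounted_infections(kappa, lam, delta, theta, r, pi_init,
                          horizon_days=150_000, dt=1.0):
    """Forward-Euler approximation of the discounted sum of new infections."""
    n = theta / delta                      # steady-state population size
    infected = pi_init * n
    total = 0.0
    for step in range(int(horizon_days / dt)):
        t = step * dt
        prevalence = infected / n
        # Incidence: infectivity x sharing rate x discordant mixing x population.
        incidence = kappa * lam * prevalence * (1.0 - prevalence) * n
        total += incidence * math.exp(-r * t) * dt
        # Infected stock grows by new infections and shrinks by natural exit.
        infected += (incidence - delta * infected) * dt
    return total


# Assumed, illustrative parameters.
kappa, lam, delta = 0.01, 1.0 / 7.0, 1.0 / 4000.0
theta = 0.25                 # arrivals per day, so the population is 1000
r = 0.03 / 365.0             # 3% per year expressed per day

v0_early = discounted_infections(kappa, lam, delta, theta, r, pi_init=0.05)
pi_ss = 1.0 - delta / (kappa * lam)
v0_endemic = discounted_infections(kappa, lam, delta, theta, r, pi_init=pi_ss)
print(round(v0_early), round(v0_endemic))
```

When the epidemic starts at its steady state, incidence is constant and the sum collapses to roughly i_0/r, which gives a convenient check on the numerical routine.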

14.4.2 The impact of syringe exchange and methadone maintenance 
treatment 

One can augment the basic model to consider the impact of both methadone 
treatment and syringe exchange. This model abstracts from a complex reality 
to highlight the qualitative impact of both kinds of interventions. MMT is 
presumed to induce a constant exit rate from the drug-using population of μ
per person per unit time, over and above the “natural” exit rate δ from the
drug-using population. MMT also reduces the rate of hazardous syringe
sharing among clients who would otherwise use illicit drugs. Instead of
going to shooting galleries at a rate of λ times per week, MMT clients
frequent these places at the rate of λ(1-β). Complete adherence corresponds
to a value of β=1.0.

To focus on the harm reduction dimension distinctive to SEP, the
intervention is presumed to have zero impact on the frequency of drug use,
and no impact on the exit rate of IDUs from the population of active
injectors. Instead of going to shooting galleries at a rate of λ times per week,
SEP clients frequent these places at the rate of λ(1-γ).

For both SEP and MMT, we assume that disease prevalence among 
treatment participants mirrors prevalence among all IDUs. On any given 
day, N(t) - I(t) = N(t)[1-π(t)] uninfected drug users remain susceptible to
infection. However, uninfected MMT clients who adhere to treatment do not
share needles. Assuming that disease prevalence among methadone clients
mirrors prevalence in the broader drug-using population, and that treatment
reduces syringe sharing by the proportion β, we must subtract Mβ[1-π(t)]
from the population of those at risk, leaving (N(t)-βM)[1-π(t)] susceptible
drug users who are actively at risk.

Both of these factors alter infectious disease incidence to

\[ i(t) = \kappa\lambda(1-\gamma)\,\pi(t)\,[1-\pi(t)]\,[N(t)-\beta M] \tag{7} \]

In like fashion, 

\[ \frac{dI(t)}{dt} = i(t) - \left(\delta + \frac{M\mu}{N(t)}\right) I(t) \tag{8} \]

Each MMT “slot” costs $c per person per day in pharmaceutical costs,
labor, and other expenses. Treatment slots are always filled. This assumption
matches conditions of excess demand in many U.S. and European cities that
experience long waiting lists. Following previous research, we posit that $c
is approximately $14/person/day. Each SEP treatment slot costs some $d per
person per day. Because SEP is a less intensive intervention, we posit that d
is approximately $5/person/day.

As in the baseline model, some θ uninfected IDUs enter the population every
day. Only now, if M IDUs receive MMT, some number (N(t)δ+Mμ) will
exit the population every day. In steady state, there will be N* active drug
users, where

\[ N^{*} = \frac{\theta - M\mu}{\delta} \tag{9} \]

Thus, one benefit of MMT is to reduce the overall population of active drug
users.



Given M treatment slots, the present discounted value of new infections is 



\[ V_M = \int_0^{\infty} i(t)\,e^{-rt}\,dt = \int_0^{\infty} \kappa\lambda(1-\gamma)\,\pi(t)\,[1-\pi(t)]\,(N(t)-\beta M)\,e^{-rt}\,dt \tag{10} \]

In similar fashion, the present discounted cost of maintaining M treatment 
slots in perpetuity is $Mc/r. If, considering treatment costs, the reduced 
lifespan and the reduced well-being of infected persons, one values an 
averted infection at some monetised level $S, the optimum policy is to 
choose the number of slots M that minimises S·V_M + Mc/r, the present
monetised value of disease incidence plus the overall treatment cost.

14.5 AVERAGE COST PER AVERTED INFECTION 

In comparing MMT to other prevention efforts or other competing uses of 
public funds, it is especially illuminating to calculate the average cost of 
MMT per averted infection. If there are no available treatment slots, the 
present discounted value of new infections is some (larger) quantity V_0. So
the average cost per averted infection would be

\[ AC = \frac{Mc}{r\,[V_0 - V_M]} \tag{11} \]

Unfortunately, V_M is difficult to solve analytically, though it is easily
computed numerically in specific cases. Pollack [88] provides further 
specific results. 
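To indicate how such a numerical computation might proceed, the sketch below approximates the discounted infection totals with and without treatment and forms the ratio Mc/(r[V_0 - V_M]). It is a minimal illustration, not the procedure used in [88]. It assumes that treatment is introduced into an epidemic already at its no-treatment steady state, that there is no SEP (γ=0), that infected IDUs leave through treatment in proportion to prevalence, and that a one-day Euler step is adequate. All parameter values and function names are hypothetical.

```python
import math


def discounted_infections(m_slots, kappa, lam, delta, theta, mu, beta, r,
                          horizon_days=150_000, dt=1.0):
    """Forward-Euler approximation of discounted infections with M slots."""
    n = theta / delta                               # pre-treatment population
    infected = (1.0 - delta / (kappa * lam)) * n    # pre-treatment steady state
    total = 0.0
    for step in range(int(horizon_days / dt)):
        prevalence = infected / n
        at_risk = max(n - beta * m_slots, 0.0)      # adherent clients do not share
        incidence = kappa * lam * prevalence * (1.0 - prevalence) * at_risk
        total += incidence * math.exp(-r * step * dt) * dt
        # Infected IDUs exit naturally and, in proportion to prevalence,
        # through the M treatment slots.
        infected += (incidence - delta * infected
                     - mu * m_slots * prevalence) * dt
        n += (theta - delta * n - mu * m_slots) * dt
    return total


def average_cost_per_averted_infection(m_slots, c, r, **model):
    """AC = M c / (r [V_0 - V_M])."""
    v0 = discounted_infections(0.0, r=r, **model)
    vm = discounted_infections(m_slots, r=r, **model)
    return m_slots * c / (r * (v0 - vm))


# Assumed, illustrative parameters.
model = dict(kappa=0.01, lam=1.0 / 7.0, delta=1.0 / 4000.0,
             theta=0.25, mu=1.0 / 2000.0, beta=0.75)
r = 0.03 / 365.0
ac = average_cost_per_averted_infection(m_slots=100.0, c=14.0, r=r, **model)
print(f"average cost per averted infection: ${ac:,.0f}")
```

The printed figure depends entirely on the assumed inputs and should not be compared with the entries of Table 14.2.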

Table 14.2 is drawn from Pollack [88]. It shows the results of one sensitivity 
analysis generated using these models. Compared with later analyses, 
including those by Zaric and colleagues [57, 75], Pollack [88] likely 
understates the cost-effectiveness of MMT for HIV prevention. As discussed
below, costs per averted HIV infection strongly increase with underlying 
HIV prevalence in the absence of intervention. MMT and other harm 
reduction interventions are highly cost-effective when applied in conditions 
of relatively low prevalence. Such interventions are markedly less cost- 
effective in conditions of very high prevalence because feasible 
interventions have only a small impact on steady-state prevalence. 
Table 14.2 Average cost per averted HIV infection (60% of
IDUs in MMT): κ=0.01, varying rates of needle sharing and treatment
adherence



                                    50% Treatment    75% Treatment    Full Treatment
                                    Adherence        Adherence        Adherence
                                    (β=0.5)          (β=0.75)         (β=1.0)

30% of MMT Clients Share Needles
    90% Relapse                     $321,304         $278,720         $240,166
    80% Relapse                     $140,655         $113,083         $103,634
    70% Relapse                     $114,072         $104,695         $99,540
    60% Relapse                     $107,140         $101,912         $98,419

20% of MMT Clients Share Needles
    90% Relapse                     $481,932         $418,062         $360,236
    80% Relapse                     $210,983         $169,625         $155,458
    70% Relapse                     $171,108         $157,042         $149,306
    60% Relapse                     $160,710         $152,868         $147,630

10% of MMT Clients Share Needles
    90% Relapse                     $963,556         $836,151         $720,491
    80% Relapse                     $421,966         $339,249         $310,917
    70% Relapse                     $342,217         $314,085         $298,613
    60% Relapse                     $321,421         $305,736         $295,260



Pollack [88] assumes very high HIV prevalence (exceeding 65%) absent 
intervention. Such a model matches the observed prevalence among street 
IDUs in New Haven, Connecticut prior to implementation of syringe 
exchange. However, this analysis likely overstates HIV prevalence in later 
IDU cohorts, in which rates of needle-sharing have declined and from which
the core group of IDUs at greatest risk may have exited the population 
through HIV infection. 

The results in Table 14.2 also demonstrate the value of treatment adherence,
β, as a function of relapse rates from MMT. At high relapse rates, treatment
adherence must also be high for cost-effective intervention. When relapse
rates are low, MMT appears quite cost-effective even given low adherence 
to the intervention. 

These results are also remarkable as an argument for the cost-effectiveness 
of even highly imperfect MMT interventions. In the baseline case, we posit 
that patients are 75% adherent to the treatment, and that fully 80% of MMT 
clients eventually relapse into injection drug use. None of the traditional 
(and large) social benefits associated with MMT - improved health status 
and productivity, and reduced criminal offending among MMT clients - are 
considered in this analysis. Yet the costs of MMT per averted infection are 
only $113,000. 

This estimate is far below reasonable valuations of the social and individual 
costs of HIV infection. For example, Holtgrave and Pinkerton estimate 
present discounted lifetime treatment costs associated with HIV infection to 
be $195,000 [89]. More important is the impact of HIV prevention on 
individual well-being. Holtgrave and Pinkerton estimate that HIV infection 
is associated with a loss of 7.10 quality-adjusted life-years (QALYs). Across 
a wide range of public health interventions, interventions costing between 
$50,000 and $150,000 per QALY are widely regarded to be cost-effective by
policymakers and the public [90]. By this cost-utility standard, MMT 
appears highly cost-effective in virtually all of our specifications when 
compared with other public health interventions. 
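The comparison with the QALY threshold can be made explicit with a line of arithmetic. The sketch below divides the baseline cost per averted infection by the QALY loss per infection quoted above. It deliberately ignores averted treatment costs and secondary infections, so it is a conservative back-of-envelope illustration rather than a figure reported in the cited studies. The variable names are hypothetical.

```python
cost_per_averted_infection = 113_000.0   # baseline MMT estimate quoted in the text
qalys_lost_per_infection = 7.10          # Holtgrave and Pinkerton estimate [89]
lifetime_treatment_cost = 195_000.0      # discounted treatment cost per infection [89]

# Cost per QALY gained if the only benefit counted is the QALY loss avoided.
cost_per_qaly = cost_per_averted_infection / qalys_lost_per_infection

# Net cost per averted infection once averted treatment costs are credited.
net_cost = cost_per_averted_infection - lifetime_treatment_cost

print(f"cost per QALY: ${cost_per_qaly:,.0f}; net cost: ${net_cost:,.0f}")
```

On these inputs the cost per QALY falls well below the $50,000 lower bound of the range cited above, and the net cost is negative once averted treatment costs are counted.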

As shown in Table 14.3, results are more discouraging for the prevention of 
HCV infection and other highly infectious agents. Within the same analytic 
framework, with all parameters identical except for a higher infectivity κ,
MMT has only a small impact on HCV incidence and prevalence due to the
higher probability of HCV transmission when needle sharing occurs. In most
cases, costs per averted HCV infection are correspondingly much higher
than those for HIV. Given modest estimates of lifetime expected treatment
costs for acute and chronic HCV infection, it is difficult to justify MMT
based on its role in HCV prevention [53, 91, 92].

Although these results are discouraging, they also indicate the great potential 
contribution of program quality to program effectiveness. Highly effective 
MMT programs - those with low relapse rates and high treatment adherence 
- can have a strong effect on HCV spread and can be cost-effective. 

14.6 SHORT-TERM INCIDENCE ANALYSIS OF SEP 

The full analytic framework for both SEP and MMT must be solved 
numerically, and is difficult to interpret from a qualitative perspective. 
Table 14.3 Average cost per averted Hepatitis C infection (60% of
IDUs in MMT): κ=0.03, varying rates of needle sharing and treatment
adherence



                                    50% Treatment    75% Treatment    Full Treatment
                                    Adherence        Adherence        Adherence
                                    (β=0.5)          (β=0.75)         (β=1.0)

30% of MMT Clients Share Needles
    90% Relapse                     $724,851         $580,067         $450,781
    80% Relapse                     $314,433         $180,162         $81,548
    70% Relapse                     $210,434         $118,877         $76,188
    60% Relapse                     $163,421         $95,055          $75,809

20% of MMT Clients Share Needles
    90% Relapse                     $1,087,180       $870,020         $676,163
    80% Relapse                     $471,641         $270,239         $122,321
    70% Relapse                     $315,647         $178,314         $114,376
    60% Relapse                     $245,130         $142,582         $113,697

10% of MMT Clients Share Needles
    90% Relapse                     $2,174,360       $1,740,102       $1,351,685
    80% Relapse                     $943,270         $540,473         $244,641
    70% Relapse                     $631,283         $356,626         $228,733
    60% Relapse                     $490,258         $285,163         $227,426



Fortunately, the short-run and steady-state implications of these models are 
tractable, and have been explored by several authors. 

The most important set of models are due to Kaplan and collaborators, and 
include the noted “circulation model” of needle exchange [93]. The
circulation model has been well-described elsewhere; its details will not be 
repeated here. Two features of that model, however, are noteworthy for this 
discussion. 
First, the circulation model related specific data - observed HIV prevalence
among needles returned to the New Haven SEP - to an underlying model of 
infectious disease transmission among IDUs. It therefore provided a more 
epidemiologically credible account of program effects than could be 
obtained through more traditional and less direct methodologies, such as 
studies that scrutinize self-reported risk behaviors among IDUs. 

Second, the circulation model explores the impact of SEP on short-term HIV 
incidence among program clients. The model assumes that SEP has little 
impact on HIV prevalence, the number of IDUs affected by the intervention, 
or the exit rate of IDUs from the active drug-using population. Within this 
framework, SEP reduces immediate HIV incidence by removing infected 
needles from the population. This effectively reduces the rate of new 
infections by reducing the product (κλ) among active IDUs.

For simplicity, assume that there are no MMT slots: infectious disease 
spread can only be reduced by SEP. If SEP reduces incidence by some factor 
γ, the short-term incidence decline can be shown to be [53]

\[ \Delta i_{\text{short-term}} = \gamma\, i_0 = \gamma\,\delta N_0 \pi_0 \tag{12} \]

Using this type of model, Kaplan and Heimer estimated that the New Haven
SEP reduced short-term HIV incidence by approximately one-third. If
steady-state prevalence is approximately 65% and δ is approximately
1/(4,000 days), an SEP that serves a population of 300 IDUs will experience
a short-term incidence decline of (1/3)(300)(1/4000)(0.65)=0.01625
infections per day, or approximately 5.9 averted HIV infections per year.

Although this appears to be a small program effect, SEP is an inexpensive
intervention, costing approximately $5 per client per day. This simple
short-term model therefore yields an estimate of $5×300/0.01625=$92,300 per
averted infection. This is a highly cost-effective intervention.
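The arithmetic behind these short-term SEP figures can be reproduced directly. The snippet below is a worked check using only the values quoted in the text; the variable names are hypothetical.

```python
gamma = 1.0 / 3.0        # proportional incidence reduction attributed to SEP
n_clients = 300          # IDUs served by the program
delta = 1.0 / 4000.0     # exit rate per day
prevalence = 0.65        # steady-state HIV prevalence
d = 5.0                  # SEP cost in dollars per client per day

# Short-term incidence decline: gamma x (delta x N x prevalence).
averted_per_day = gamma * n_clients * delta * prevalence
averted_per_year = averted_per_day * 365

# Daily program cost divided by daily averted infections.
cost_per_averted = d * n_clients / averted_per_day

print(f"{averted_per_day:.5f} per day, {averted_per_year:.1f} per year, "
      f"${cost_per_averted:,.0f} per averted infection")
```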

Because a highly infectious agent such as HCV has a higher rate of new 
infections than HIV, this short-term incidence model yields slightly smaller 
estimated costs per averted infection for HCV than for HIV. Unfortunately, 
as shown below (Section 14.8), such findings can be misleading because 
they fail to account for long-term effects. 

14.7 SHORT-TERM INCIDENCE ANALYSIS OF MMT 

A similar short-term incidence model is readily derived for MMT. The
short-term impact of MMT on infectious disease incidence can be considered to be
the short-term reduction in the rate of new infections, assuming that
infectious disease prevalence and the overall number of IDUs are stable.
Expressed more formally, the short-term effect of a small addition to the
number of MMT slots may be written as

\[ \frac{\partial i}{\partial M} = -\frac{\beta\, i}{N - \beta M} \tag{13} \]

At the margin, one additional treatment slot will cost $c, so the marginal
cost per averted infection is

\[ \frac{c\,(N - \beta M)}{\beta\, i} \tag{14} \]

If one posits that M is close to 0, and applies the baseline model of SEP -
steady-state HIV prevalence of 65% and β=0.75 - an MMT intervention that
costs $14 per day yields an estimated cost per averted infection of $114,872.
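The quoted figure can be checked under one additional assumption: with M close to 0 and a stable population, incidence is approximately δπN, so the marginal cost reduces to c/(βδπ) and the population size cancels. The sketch below uses δ=1/4000 per day, carried over from the SEP example, and a hypothetical population of 300. The function name is illustrative.

```python
def marginal_cost_per_averted_infection(c, beta, incidence, n, m=0.0):
    """Marginal cost c (N - beta M) / (beta i) of one more treatment slot."""
    return c * (n - beta * m) / (beta * incidence)


n = 300                               # hypothetical population; it cancels out
delta, prevalence = 1.0 / 4000.0, 0.65
incidence = delta * prevalence * n    # assumes i is approximately delta * pi * N

cost = marginal_cost_per_averted_infection(c=14.0, beta=0.75,
                                           incidence=incidence, n=n)
print(f"${cost:,.0f} per averted infection")
```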

14.8 STEADY-STATE CALCULATIONS 



Explicit and tractable frameworks such as the circulation model brought new 
rigor to HIV prevention policy. However, the specific features, findings, and 
simplifying assumptions of such models, while appropriate for the HIV 
epidemic among IDUs, may prove misleading in other settings. HIV disease 
unfolds over a long period of time and is life-threatening. HIV is relatively 
difficult to transmit in any one exposure, such as a hospital needle-stick 
accident or the sharing of needles between infected and uninfected IDUs. 
When one alters these features, short-term incidence analysis may have 
important shortcomings. 

One might assume that short-term incidence analysis understates the long- 
term value of prevention. If an intervention directly prevents 100 IDUs from 
being infected this year, the intervention also benefits the sexual and needle- 
sharing partners of these IDUs. Such “downstream” infections are not 
considered in short-term incidence models. This intuition is correct for 
prevention interventions such as polio vaccination that provide long-term 
protection. However, this intuition is false when prevention interventions 
provide imperfect or temporary protection to treated individuals. If steady- 
state prevalence is quite high, many of the original 100 IDUs will become 
infected in later periods. Because a prevention intervention merely delays 
infection for some treated individuals, short-term analysis of disease 
incidence can provide over-optimistic estimates of program effectiveness. In 
fact, ignoring “downstream” infections can either overstate or understate
long-term program effects [54]. 

Steady-state analysis allows one to explore these claims, and to scrutinize 
the specific conditions under which short-term incidence analysis will 
overstate or understate long-term program effects [54]. The steady-state 
approach is especially suited to the analysis of rapid infectious disease 
transmission within a stable environment. As shown by Pollack [54], spread 
of a highly infectious agent such as HCV quickly approaches equilibrium 
incidence and prevalence. Such an analysis is less applicable to a less 
efficiently transmitted agent such as HIV, which displays much slower 
convergence to steady-state prevalence. 

Figure 14.1, drawn from Pollack [54], provides more specific information. It 
is computed using the needle-sharing rates and mean drug-using careers 
shown in Table 14.1. The infectivity k is allowed to vary across the 
empirically plausible range for both HIV and HCV. The figure displays the 
time required to move from 5% initial prevalence to 90% of steady-state 
prevalence across the empirically pertinent range of parameters. This 
framework overstates the time required to converge to steady state in actual 
policy settings, because HCV often reaches endemic levels before policy 
makers are able to intervene. 

At low infectivities, the time required to reach steady state is substantial. For 
example, HIV policy analysts have used the value κ=0.0036 in published
work. At this infectivity, numerical analysis indicates a convergence time of 
more than 30 years. Under these assumptions, steady-state analysis is less 
pertinent than short-term incidence analysis for public policy. Moreover, 
short-term analyses such as the circulation model yield results similar to 
those obtained through more elaborate dynamic models. Somewhat 
fortuitously, short-term incidence analysis for HIV provides an acceptable 
approximation of long-term effects. 

Convergence times rapidly decline as infectivity increases. For example, if
κ=0.015, convergence is reached in 7.25 years. When κ=0.025, convergence
is reached within 4.2 years. For HCV and other highly infectious diseases,
infectivity is even higher, making steady-state analysis most pertinent to 
evaluate medium-term and long-term effects. Such rapid convergence to 
steady state is also observed empirically, for example in the high rates of 
HCV incidence among young Baltimore IDUs [71]. 
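With a constant population, prevalence in the random-mixing model follows a logistic curve, so the convergence time has a closed form. The sketch below is an illustration under stated assumptions: one sharing episode per week (λ=1/7 per day) and δ=1/4000 per day are assumed values, chosen because they happen to reproduce the convergence times quoted above; Table 14.1 is not reproduced here. The function name is hypothetical.

```python
import math


def years_to_convergence(kappa, lam=1.0 / 7.0, delta=1.0 / 4000.0,
                         pi_init=0.05, fraction=0.90):
    """Years for prevalence to rise from pi_init to a fraction of steady state.

    With constant N, d(pi)/dt = kappa*lam*pi*(1 - pi) - delta*pi, so the
    ratio pi / (pi_ss - pi) grows exponentially at rate a = kappa*lam - delta.
    """
    b = kappa * lam
    a = b - delta
    if a <= 0:
        raise ValueError("R_0 <= 1: no positive steady state")
    pi_ss = a / b                      # equals 1 - 1/R_0
    target = fraction * pi_ss
    if not 0.0 < pi_init < target:
        raise ValueError("initial prevalence must lie below the target")
    start = pi_init / (pi_ss - pi_init)
    end = target / (pi_ss - target)
    return math.log(end / start) / a / 365.0


for kappa in (0.0036, 0.015, 0.025):
    print(kappa, round(years_to_convergence(kappa), 2))
```

Because the growth rate κλ - δ appears in the denominator, convergence time falls sharply as infectivity rises, which is the pattern described in the text.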
Figure 14.1 Time to convergence in random-mixing models




For SEPs, one can explicitly calculate the steady-state impact. Steady-state 
incidence is 



\[ i_{SEP} = \theta\left[1 - \frac{1}{(1-\gamma)R_0}\right] = \delta N_0\left[1 - \frac{1}{(1-\gamma)R_0}\right] \tag{15} \]

Manipulating equation (15) and assuming positive prevalence, the
steady-state change in HCV incidence is given by

\[ \Delta i_{\text{long-term}} = \frac{\gamma\,\delta N_0}{R_0(1-\gamma)} = \frac{\gamma\,\theta}{R_0(1-\gamma)} \tag{16} \]

Comparing the long-term and short-term changes in incidence, short-term
analysis will overstate steady-state program effectiveness whenever
Δi_long-term < Δi_short-term. This happens exactly when R_0 > (2-γ)/(1-γ).

When γ=1/3, the break-even point occurs when R_0=2.5, or, equivalently,
π_0=0.60. That is, short-term incidence analysis will overstate
steady-state program effectiveness whenever steady-state prevalence exceeds 60%
in the absence of SEP. By the same logic, short-term incidence analysis will 
understate program effectiveness when steady-state prevalence is below 
60% absent SEP. 
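The break-even condition follows in a few lines. The derivation below is a sketch that takes the short-term decline to be γ i_0 with i_0 = θ(1 - 1/R_0), and the long-term decline to be γθ/[R_0(1-γ)], assuming 0 < γ < 1 and positive prevalence.

```latex
\begin{align*}
\Delta i_{\text{long-term}} < \Delta i_{\text{short-term}}
  &\iff \frac{\gamma\theta}{R_0(1-\gamma)} < \gamma\theta\left(1 - \frac{1}{R_0}\right) \\
  &\iff \frac{1}{1-\gamma} < R_0 - 1 \\
  &\iff R_0 > \frac{2-\gamma}{1-\gamma}.
\end{align*}
% With \gamma = 1/3: (2 - 1/3)/(1 - 1/3) = 2.5, and \pi_0 = 1 - 1/2.5 = 0.60.
```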
If one provides SEP to all active IDUs, the average cost per averted infection
is

\[ AC = \frac{N_0 d}{i_0 - i_{SEP}} = \frac{N_0 d}{\delta N_0\left[\left(1-\frac{1}{R_0}\right) - \left(1-\frac{1}{(1-\gamma)R_0}\right)\right]} = \frac{d\,(1-\gamma)R_0}{\delta\gamma} \tag{17} \]

Setting d=$5/day, γ=0.333, and δ=1/(4000 days), this implies that
AC=40,000R_0. When R_0=2.5, SEP would prevent infections at an
approximate cost of $100,000 per averted infection.
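The steady-state cost figure can be verified with the closed form d(1-γ)R_0/(δγ). The short check below uses the values quoted in the text; the function name is hypothetical.

```python
def sep_average_cost(d, gamma, delta, r0):
    """Steady-state average cost per averted infection: d (1 - gamma) R_0 / (delta gamma)."""
    return d * (1.0 - gamma) * r0 / (delta * gamma)


d, gamma, delta = 5.0, 1.0 / 3.0, 1.0 / 4000.0

cost_per_unit_r0 = sep_average_cost(d, gamma, delta, r0=1.0)
cost_at_break_even = sep_average_cost(d, gamma, delta, r0=2.5)
print(f"${cost_per_unit_r0:,.0f} per unit of R_0; "
      f"${cost_at_break_even:,.0f} when R_0 = 2.5")
```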

Pollack compares short-term and steady-state models [53, 54]. Figure 14.2
shows these results. The y-axis indicates, in percentage terms, the amount 
that short-term analysis overstates (or understates) the steady-state impact of 
prevention interventions. At low steady-state prevalences, short-term 
incidence analysis understates long-term program effects. In such cases, 
averted secondary infections magnify the benefits of prevention 
interventions. At high steady-state prevalences, the opposite effects occur. 
Although incidence declines in the short-term, individuals who received 
short-term protection are likely to become infected later. Thus, the long-term 
impact of intervention is much smaller than one would predict based on 
short-term program effects. 

One can conduct a similar steady-state analysis of MMT. In steady state, 
there will be N* active drug users, with steady-state prevalence π*. Every
day, some θ uninfected individuals initiate drug use, while (N*δ+Mμ) IDUs
leave the population. So

\[ N^{*} = \frac{\theta - M\mu}{\delta} \tag{18} \]

When steady-state prevalence is positive, one can show after algebra that

\[ \pi^{*} = 1 - \frac{1}{R_0}\cdot\frac{\theta}{\theta - M(\mu+\beta\delta)} \tag{19} \]

As before, the quantity (δ/κλ) is the reciprocal of the reproductive rate of
infection, or R_0.



The quantity θ/[θ-M(μ+βδ)] reflects the reduction in disease prevalence
attributable to treatment. The quantity (μ+βδ) captures the effect of MMT on
increasing exit from the drug-using population (μ), and also includes the
effect of treatment on reducing needle sharing while individuals are in
treatment (βδ).

Figure 14.2 Bias in short-term incidence estimation for modest
interventions (negative values indicate understatement of program effect)

One can show that steady-state disease incidence is given by 



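\[ i^{*}(M) = \theta\pi^{*} = \theta\left[1 - \frac{1}{R_0}\cdot\frac{\theta}{\theta - M(\mu+\beta\delta)}\right] \tag{20} \]

Since treatment costs $c per client per day, the total cost of drug treatment is
$Mc per day. At positive steady-state prevalence, average cost per averted
infection is therefore*

\[ AC = \frac{cM}{i_0 - i^{*}(M)} = \left(\frac{c}{\mu+\beta\delta}\right) R_0 \left[1 - \frac{M(\mu+\beta\delta)}{\theta}\right] \tag{21} \]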



Costs decline as exit rates of IDUs attributable to treatment intervention,
(μ+βδ), increase. Costs decline with the number of treatment slots (M), and
depend on (μ+βδ)/θ, the ratio of exits due to treatment over the arrival rate
of uninfected people into the IDU population.

* If steady-state prevalence goes to zero, the average cost per averted infection is
given by AC=cMR_0/[θ(R_0-1)].

As with SEP, the cost per averted infection is proportional to R_0. Thus,
measures that reduce steady-state prevalence can be significant, even if
steady-state prevalence remains high. Suppose, for example, that
steady-state prevalence π_0 is 90% prior to any intervention, and that the average
cost associated with MMT per averted infection is $100,000. If, independent
of MMT, one could reduce needle sharing rates or other risks to reduce π_0 to
85%, this small change in prevalence would reduce R_0 from a value of 10 to
a value of 6.67. This apparently small prevalence decline corresponds to a 
one-third improvement in the cost-effectiveness of MMT. 
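The example can be reproduced by inverting the steady-state relationship π_0 = 1 - 1/R_0. The sketch below assumes, as the text does, that cost per averted infection scales in proportion to R_0 with everything else held fixed. The function name is hypothetical.

```python
def r0_from_prevalence(pi0):
    """Invert pi_0 = 1 - 1/R_0 to obtain R_0 = 1 / (1 - pi_0)."""
    if not 0.0 <= pi0 < 1.0:
        raise ValueError("prevalence must lie in [0, 1)")
    return 1.0 / (1.0 - pi0)


r0_before = r0_from_prevalence(0.90)
r0_after = r0_from_prevalence(0.85)

cost_before = 100_000.0
cost_after = cost_before * r0_after / r0_before   # cost proportional to R_0

print(f"R_0: {r0_before:.2f} -> {r0_after:.2f}; "
      f"cost: ${cost_before:,.0f} -> ${cost_after:,.0f}")
```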



When the number of treatment slots is extremely small compared to the 
population of IDUs, the average cost per averted infection is

\[ AC \approx \frac{c\,R_0}{\mu+\beta\delta} \tag{22} \]



If one applies the figures for HIV prevalence discussed above (R_0=2.5), the
average cost per HIV infection in steady-state is approximately $15,000 - a 
figure far below that obtained by short-term analysis. Because MMT reduces 
the overall size of the IDU population and reduces the steady-state 
prevalence of infection, short-term incidence analysis understates the value 
of MMT. 



Note also that MMT has economies of scale. Steady-state prevalence, 
incidence, and the average cost per averted infection all decline as a larger 
fraction of active IDUs is served. Broad provision of MMT assists individual 
clients. It also generates a kind of herd immunity - creating beneficial 
spillovers to reduce prevalence among all IDUs [88]. 

Sometimes - but not always - broad provision of MMT can drive steady- 
state prevalence to zero. Given imperfect adherence, an epidemic can 
survive at positive steady-state prevalence even when all IDUs are enrolled 
in MMT. Setting M=N*, it is possible to drive prevalence to zero exactly
when μ > δ[R_0(1-β)-1].
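The elimination condition can be checked numerically against the steady-state prevalence expression 1 - (1/R_0)·θ/[θ - M(μ+βδ)]. The sketch below assumes full coverage, so that M = N* and therefore M = θ/(δ+μ), and uses illustrative parameter values. The function names are hypothetical.

```python
def mmt_steady_state_prevalence(m_slots, theta, delta, mu, beta, r0):
    """Steady-state prevalence with M slots, floored at zero."""
    remaining = theta - m_slots * (mu + beta * delta)
    if remaining <= 0:
        return 0.0
    return max(0.0, 1.0 - (1.0 / r0) * theta / remaining)


def full_coverage_slots(theta, delta, mu):
    """M = N* with N* = (theta - M mu) / delta gives M = theta / (delta + mu)."""
    return theta / (delta + mu)


def eliminates_epidemic(delta, mu, beta, r0):
    """Condition for zero prevalence under full coverage."""
    return mu > delta * (r0 * (1.0 - beta) - 1.0)


# Assumed, illustrative values.
theta, delta, beta, r0 = 0.25, 1.0 / 4000.0, 0.5, 2.5

for mu in (0.0, 1.0e-4):
    m = full_coverage_slots(theta, delta, mu)
    pi_star = mmt_steady_state_prevalence(m, theta, delta, mu, beta, r0)
    print(mu, eliminates_epidemic(delta, mu, beta, r0), round(pi_star, 3))
```

With 50% adherence and no treatment-induced exit, the epidemic persists even at full coverage; a modest treatment-induced exit rate is enough to tip the condition in this example.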

14.9 CONCLUSIONS AND FUTURE RESEARCH 

Many insights for public policy can be drawn from the epidemics of 
substance abuse and HIV/AIDS. Operations researchers have provided many 
of these insights, and have the tools to critically scrutinize these insights 
when they are applied to new problems in new ways. 
Operations researchers have provided data and methodologies that allow fair
comparison of competing strategies to reduce illicit substance use. For HIV 
prevention, policy models allow policymakers to evaluate public 
investments in MMT and SEP by many of the same impact and cost- 
effectiveness standards as other public health measures. When such 
comparisons are made, HIV prevention interventions for IDUs compare 
favorably to prenatal care, car safety seats, and other widely accepted 
interventions [94]. 

Some lessons learned from HIV may not apply to other problems. 
Opponents of harm reduction argue that measures to make substance use 
safer are a foolish and ineffective response to the individual and social harms 
associated with injection drug use. According to this view, “use reduction” is 
essential to achieve lasting social benefit. The effectiveness and cost- 
effectiveness of SEP for HIV prevention provides a strong rebuttal of such 
use-reduction arguments. The need for use reduction appears more 
compelling when one considers more infectious agents such as HCV [28]. 

Public health problems facing IDUs raise new challenges for both
operations researchers and policymakers.

The impact of high street purity on drug use behavior and drug treatment 
outcomes remains unknown. Many heroin users now consume the drug in 
non-injectable form. If such drug use is stable over time, non-injectable 
forms of heroin use may help to slow blood-borne epidemics among IDUs. 
Yet if non-injecting heroin users frequently transition to injection, the rise of 
heroin snorting and other behaviors may be a significant problem for both 
substance abuse policy and public health. In one study of Baltimore IDUs, 
only one-fourth of respondents had initiated heroin by injecting. Yet two- 
thirds of respondents reported some injection drug use. The most durable 
changes in route of heroin administration were towards high-risk behaviors 
[95]. 

The impact of improved HIV treatments raises more complex concerns for 
the design and operation of harm reduction and treatment interventions. 
Improved treatment lengthens life, may lengthen the period of high-risk 
behavior among IDUs, and may also reduce the probability of HIV 
transmission when there is needle sharing between infected and uninfected 
IDUs. The impact of such therapies has spawned a large literature in 
epidemiological policy modeling [87]. The spread of multi-drug-resistant 
strains has also attracted attention [96]. All of these developments heighten 
the importance of long-term prevention interventions for HIV-infected 
IDUs. 
The histories of the substance abuse epidemic and the HIV epidemic are
tragic in many ways. In the U.S., HIV occasioned late and inadequate policy 
responses to an epidemic afflicting IDUs and other stigmatized groups. This 
led to much avoidable mortality and morbidity among IDUs, their sexual 
partners, their children, and others [97, 98]. In the case of illicit drug use, 
policymakers continue to favor law-enforcement policies that are more 
punitive and less cost-effective than best-practice prevention or treatment 
interventions. 

The most important reasons for these policy failures lie outside the
immediate realm of policy analysis: they have arisen due to the quality of 
public management, moral and ideological choices, and interest-group 
politics that do not favor the groups at greatest risk. The best policy analysis 
is often powerless to overcome these factors. Sigmund Freud once 
commented that the voice of intellect is soft, but will not rest until it gains a 
hearing [99]. In this quiet but insistent way, operations researchers remind 
skeptical citizens and policymakers of the value of sound interventions that 
reduce premature death and avoidable suffering among IDUs. 
SUMMARY

Development of a vaccine remains the best hope for curtailing the 
worldwide pandemic caused by human immunodeficiency virus (HIV) 
infection. Due to the complex biology of HIV infection, there is increasing 
concern that an HIV vaccine may provide incomplete protection from 
infection. In addition to reducing susceptibility to disease, an HIV vaccine 
may also prolong life in people who acquire HIV despite vaccination, and 
may reduce HIV transmission. We evaluated how varying degrees of 
vaccine efficacy for susceptibility, progression of disease, and infectivity 
influence the costs and benefits of a vaccine program in a population of men 
who have sex with men. We found that the health benefits, and thus cost
effectiveness, of HIV vaccines were strikingly dependent on each of the 
types of vaccine efficacy. We also found that vaccines with even modest 
efficacy provided substantial health benefits and were cost effective or cost 
saving. Although development of an HIV vaccine has been extremely 
difficult, even a partially effective HIV vaccine could dramatically change 
the course of the HIV epidemic. 
15.1 INTRODUCTION

At the end of 2002, 42 million people were living with human 
immunodeficiency virus (HIV) infection. New infections were occurring at 
about 14,000 per day [1]. By 2010, an additional 45 million people will 
become infected with HIV if current trends continue [1]. Highly active 
antiretroviral therapy is very effective, but it is unavailable in most low- 
income countries where 95% of the HIV infections occur. Development of 
an HIV vaccine remains the best hope for curtailing the worldwide 
pandemic. 

Despite intensive effort, development of an HIV vaccine has remained 
elusive. Many candidate vaccines have undergone clinical trials, but only 
one vaccine, AIDSVAX, has undergone the large-scale Phase III efficacy trials
that are required for vaccine licensing. Preliminary results of the first 
AIDSVAX trial, reported in early 2003, indicated that the vaccine failed to 
reduce HIV infection rates in the overall group of vaccine recipients. In 
subgroup analyses, the manufacturer reported that the vaccine reduced HIV 
infection rates by 67% in non-Hispanic minorities, and by 78% among black 
recipients. The subgroup analyses were highly controversial because of 
small sample sizes. Even if these results become accepted, however, they 
would further confirm the belief among many experts that if a vaccine 
becomes available, it would likely provide only partial protection from HIV. 
This view led the Centers for Disease Control and Prevention and the World 
Health Organization to hold consultations to examine how partially effective 
HIV vaccines should be used [2]. 

The increasing concern that an HIV vaccine would be only partially 
effective has led to considerable interest in how to model vaccine efficacy 
(VE) for HIV vaccines. Vaccines for HIV may act to reduce the burden of 
disease in three ways. First, the vaccine may reduce susceptibility to 
disease, as do most familiar vaccines. In a framework developed by Longini 
and colleagues [3-5], this component of vaccine efficacy is termed the 
vaccine efficacy for susceptibility, VE_S. Because VE_S is likely to be less
than 100% (because of incomplete protection), a person who has been 
vaccinated may subsequently become infected with HIV. Unlike some 
traditional vaccines, an HIV vaccine may also ameliorate disease in those 
who become infected. The vaccine would likely work by improving the 
ability of the immune system to suppress HIV viral replication. This 
suppression would lead to the two additional means by which the vaccine 
could reduce the disease burden from HIV infection: the vaccine could slow 
progression of HIV disease and decrease the likelihood of transmission of 
HIV. Transmission would likely decrease because the probability of 
transmission is related to the level of virus in the blood: transmission occurs
less readily if the level of virus in blood (and other body fluids) is low [6, 7].
Thus, an HIV vaccine may have efficacy to reduce progression of disease,
VE_P, and efficacy to reduce infectivity, VE_I [3-5], in addition to efficacy to
prevent infection (VE_S).

Should a partially effective vaccine be used? How good must a vaccine be 
before public health officials recommend its use? These questions are 
complex, in part because a vaccine with low VE_S might still have substantial
health benefit if either VE_P or VE_I were high. In addition, because a vaccine
program would require substantial resources, the question of whether to use 
the vaccine also depends on the costs of the program. To address these 
questions, we developed a dynamic transmission model to represent the 
effects of a vaccine in a population, and an economic model to assess the 
costs associated with the vaccine program [8-10]. We modeled two types of 
vaccines: a preventive vaccine (VE_S > 0, VE_P = 0, VE_I = 0) that would be
given to uninfected people, and a therapeutic vaccine (VE_S = 0, VE_P > 0, VE_I
> 0) that would be given to people known to have HIV. By evaluating both
types of vaccines, we can understand how VE_S, VE_P, and VE_I influence both
the health benefits and costs of a vaccine program. We evaluated the costs 
and benefits of these vaccine programs in a population of men who have sex 
with men (MSM) designed to reflect the population in San Francisco, 
California. 

This chapter builds on previous work we have done in evaluating potential 
HIV vaccines [8-10]. We have recast our previous work into a framework 
for analysis of vaccines that has recently been developed. This framework
conceptualizes vaccine efficacy in terms of efficacy for susceptibility, for 
progression, and for infectivity. In addition, the work in this chapter 
assumes no behavior change (positive or negative) in the base case. 
Arguments have been made about why risk behavior might increase or 
decrease with a vaccine program, but recent evidence has not supported the 
more pessimistic assumption that we used in earlier work that risk behavior 
would increase. Additionally, we updated costs to reflect 2003 dollars. 

15.2 METHODS 

15.2.1 Model and data 

Details about the model structure, input data, and validation are available 
elsewhere [8-10], so we provide an abbreviated overview here. A schematic 
depiction of the model is shown in Figure 15.1. The diagram in Figure 15.1 
is substantially simplified but indicates the important relationships captured 
in the model. The figure shows the vaccinated cohort for both preventive 
and therapeutic vaccines. For a preventive vaccine, infection is attenuated.
For a therapeutic vaccine, progression of disease is attenuated. Infectivity
may also be reduced by a therapeutic vaccine.

Figure 15.1 Schematic of model structure (panels: Preventive Vaccine Program; Therapeutic Vaccine Program)

conduct of cost-effectiveness analyses [12], we discounted both health and
economic outcomes. We discounted outcomes at 5%; expenditures are 
expressed in 2003 dollars. 
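
As a minimal sketch of the discounting step, assuming outcomes are tallied once per year and that year 0 is undiscounted (the chapter does not state the timing convention), the present value of a stream of annual costs or QALYs at the 5% rate could be computed as follows. The function name and the example stream are illustrative only.

```python
def present_value(annual_values, rate=0.05):
    """Discount a sequence of yearly values (year 0 first) to present value."""
    return sum(v / (1.0 + rate) ** t for t, v in enumerate(annual_values))


# Hypothetical stream: one QALY gained in each of years 0, 1, and 2.
pv_qalys = present_value([1.0, 1.0, 1.0])
print(round(pv_qalys, 4))  # 1 + 1/1.05 + 1/1.05**2, about 2.8594
```

The same function applies to costs and to health outcomes, since both are discounted at the same rate in this analysis.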

We modeled the effects of vaccines and progression of disease as changes in 
the transition rate from one model compartment to another (Figure 15.1). 
For a preventive vaccine program, the rate of infection in individuals in the 
vaccinated cohort is attenuated by VE_S, which we defined as the proportion
of vaccine recipients who are protected from infection. For a therapeutic
vaccine, we assumed for simplicity that the efficacy of the vaccine in
reducing progression of disease (that is, in prolonging life), VE_P, occurred
via prolongation of life during the asymptomatic phase of infection. We
varied the degree of this increase from one year to ten years. We also
modeled changes in infectivity of vaccine recipients (VE_I) as reductions in
transmission to contacts. We modeled disease progression (for both types of 
vaccine programs) from asymptomatic disease, to symptomatic disease, and 
then to AIDS (not shown in Figure 15.1). In addition, we modeled 
interactions of the vaccinated cohort with the uninfected people in the 
population. In the analyses we report here, we assumed that vaccine 
recipients would not change risky behavior. We have evaluated the 
importance of behavior change previously [8-10]. 
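
The way the three efficacies act on transition rates can be illustrated with a short sketch. It is not the model used in this chapter: the function names and all numeric inputs are hypothetical, and the third function additionally assumes exponentially distributed time in the asymptomatic compartment, so that the exit rate is the reciprocal of the mean duration.

```python
def vaccinated_infection_rate(base_rate, ve_s):
    """Infection rate in the vaccinated cohort, attenuated by VE_S."""
    return base_rate * (1.0 - ve_s)


def vaccinated_transmission_prob(base_prob, ve_i):
    """Transmission probability from an infected vaccine recipient to a
    contact, reduced by VE_I."""
    return base_prob * (1.0 - ve_i)


def asymptomatic_exit_rate(mean_years, added_years=0.0):
    """Rate of leaving the asymptomatic compartment when the vaccine adds
    `added_years` to its mean duration (assumes exponential dwell times)."""
    return 1.0 / (mean_years + added_years)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
print(vaccinated_infection_rate(0.04, ve_s=0.5))     # half the unvaccinated rate
print(vaccinated_transmission_prob(0.10, ve_i=0.9))  # one tenth of the unvaccinated value
print(asymptomatic_exit_rate(8.0, added_years=2.0))  # slower exit: 1 / (8 + 2) per year
```

Setting ve_s, ve_i, or added_years to zero recovers the unvaccinated rates, which mirrors the preventive (VE_P = 0, VE_I = 0) and therapeutic (VE_S = 0) cases.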

We calculated the incremental cost effectiveness of the vaccine program as 
the difference in costs between the vaccinated and unvaccinated cohorts, 
divided by the difference in health benefit. For example, to estimate the cost 
effectiveness in dollars per QALY gained, we calculated the cost- 
effectiveness ratio as: 



($_vaccinated - $_unvaccinated) / (QALYs_vaccinated - QALYs_unvaccinated)

If the vaccine increased both costs and health benefit, we calculated the cost- 
effectiveness ratio. If the vaccine provided benefits while reducing costs, we 
said that vaccination dominated the strategy of no vaccination. 
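
The ratio and the dominance rule can be sketched as a small function. This is an illustration, not the authors' code: the function name is hypothetical, and the example inputs are the rounded incremental figures reported for the preventive vaccine in Section 15.3.1, with the unvaccinated cohort set to zero so that the inputs are already differences.

```python
def incremental_cost_effectiveness(cost_vacc, cost_unvacc, qalys_vacc, qalys_unvacc):
    """Classify a vaccine program against no vaccination.

    Returns ("dominates", None) when the program adds QALYs and lowers costs,
    ("ratio", dollars_per_qaly) when it adds QALYs without lowering costs,
    and ("not evaluated", None) when it adds no QALYs.
    """
    d_cost = cost_vacc - cost_unvacc
    d_qaly = qalys_vacc - qalys_unvacc
    if d_qaly > 0 and d_cost < 0:
        return ("dominates", None)
    if d_qaly > 0:
        return ("ratio", d_cost / d_qaly)
    return ("not evaluated", None)


# 10% efficacy, 5-year duration: about $83 million spent, about 3,600 QALYs gained.
print(incremental_cost_effectiveness(83e6, 0.0, 3600.0, 0.0))

# 90% efficacy, 50-year duration: about $75 million saved, about 34,000 QALYs gained.
print(incremental_cost_effectiveness(-75e6, 0.0, 34000.0, 0.0))
```

With these rounded inputs the first case gives roughly $23,000 per QALY gained, close to the approximately $24,000 reported for that vaccine, and the second case is classified as dominant.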

15.2.2 Model inputs

We estimated inputs for the model from published and unpublished data 
about the population of MSM in San Francisco [8-10]. Key input data for 
the model are shown in Table 15.1.

Table 15.1 Input Variables

Variable                                      Base-Case Value (range)

Preventive Vaccine
  Efficacy for susceptibility, VE_S           10%-90%
  Efficacy for progression of disease, VE_P   0%
  Efficacy for infectivity, VE_I              0%
  Duration, years                             5 to 50
  Proportion of population vaccinated         75% (10%-100%)
  Per-person cost                             $1,000

Therapeutic Vaccine
  Efficacy for susceptibility, VE_S           0%
  Efficacy for progression of disease, VE_P   1-10 years of additional life
  Efficacy for infectivity, VE_I              0%-90% reduction in infectivity
  Proportion of population vaccinated         75% (10%-100%)
  Per-person cost                             $1,000

Population Parameters
  Initial size                                55,800
  Prevalence, late-stage epidemic             49%
  Mean age, years                             30

Sources and detailed input data are provided elsewhere [9-11]. We 
estimated transmission probabilities from epidemiologic studies and model- 
based estimates. We evaluated vaccine programs in two types of epidemics, 
an early-stage epidemic and a late-stage epidemic. In an early-stage 
epidemic, the prevalence of HIV infection is relatively low (10%) but 
increasing. Such an epidemic may reflect younger MSM who have higher 
levels of risky behavior and a higher number of annual partnerships. In a
late-stage epidemic, the prevalence is relatively high (approximately 50%) 
and is decreasing. Such an epidemic may reflect older MSM who have 
lower levels of risky behavior and fewer partnerships. We report here 
results for the late-stage epidemic; we evaluated early-stage epidemics 
elsewhere [8-10]. Because we estimated the parameters for the model in the 
era prior to the advent of highly active antiretroviral therapy, the model
underestimates length of life for patients who receive treatment under
current regimens. We discuss the implications of newer therapies on our 
results in Section 15.4. 

The characteristics of HIV vaccines are, of course, unknown. Therefore, we 
evaluated vaccines with many plausible combinations of efficacy and 
duration of action. We arbitrarily assumed that a vaccine would cost $1,000, 
and varied this value widely. 

15.3 RESULTS 

15.3.1 Preventive vaccine 

We evaluated vaccine efficacy for susceptibility (VE_S) that ranged from 10%
protection to 90%, with vaccine durations of 5, 10, and 50 years (Figures 
15.2 and 15.3). Figure 15.2 shows the health and economic outcomes for 
vaccines with varying efficacy and duration. The figure indicates the net 
increase in QALYs and expenditures (or savings) in millions of dollars ($M) 
for a preventive vaccine after 150 years, assuming no change in risk 
behaviors, in a late-stage epidemic, with 75% of the population vaccinated. 
Each point on the polygon represents a preventive vaccine with different 
efficacy and duration. Squares on the top line of the polygon represent 
preventive vaccines with efficacy of 10% to 90% and a duration of 5 years. 
Squares on the bottom line represent a vaccine with a duration of 50 years. 

The dotted lines indicate cost-effectiveness thresholds of $50,000 and 
$10,000 per QALY gained. A vaccine represented by a point on the polygon 
that falls between the two dotted lines has a cost-effectiveness ratio between 
$50,000 and $10,000 per QALY gained. A vaccine represented by a point to 
the right of the $10,000 per QALY line costs less than $10,000 per QALY
gained. Points on the polygon below the horizontal axis represent vaccines 
that reduce net expenditures, and therefore dominate the no-vaccination 
strategy. A vaccine with any combination of efficacy and duration within 
the ranges noted will fall within the polygon in Figure 15.2. Points to the 
right of these lines cost less than the threshold rate per QALY gained. 

Our analyses indicate that vaccines need not be highly effective to have
substantial health benefit with reasonable expenditures (Figure 15.2). For
example, a vaccine with only 10% efficacy and duration of 5 years (the top
left point on the polygon) resulted in expenditures of about $83 million
and a net increase of about 3,600 QALYs at a cost of less than
$50,000 per QALY gained (Figure 15.2). A vaccine with an efficacy of 90%
and duration of 50 years (that is, lifelong protection, represented by the
rightmost point on the graph) reduced expenditures by approximately $75
million and increased QALYs in the vaccinated cohort by approximately
34,000. Thus, a vaccine with low efficacy was cost effective by the
standards of high-income countries, and highly effective vaccines dominated
the no-vaccination strategy by reducing expenditures while providing very
large health benefit. More effective vaccines reduced total expenditures
because the cost of the vaccine program was outweighed by the savings
associated with prevention of HIV infection.

Figure 15.2 Long-term outcomes of a preventive vaccine

Figure 15.3 Effect of vaccine efficacy for susceptibility on the
cost effectiveness of a preventive vaccine

In Figure 15.3, we indicate more directly how changes in VE_S influenced the
cost-effectiveness ratio. The figure shows the incremental cost effectiveness 
of a vaccine program relative to the no-vaccination strategy in dollars per 
QALY gained. Vaccines with duration of 5, 10, and 50 years are shown by 
different lines. The values represent long-term outcomes, with no behavior 
change, and 75% of the population vaccinated. With a vaccine efficacy of 
10%, cost effectiveness varied from about $7,000 per QALY gained 
(duration of 50 years) to approximately $24,000 per QALY gained (duration 
of 5 years). As vaccine efficacy increased, the vaccine program became 
increasingly cost effective. Where a line in Figure 15.3 reaches the x-axis,
the vaccine strategy dominated the no-vaccination strategy at that efficacy
and above. Vaccination dominated the no-vaccination strategy
when efficacy reached approximately 35%, 55%, and 90% for a vaccine 
with duration of 50 years, 10 years, and 5 years respectively. 

15.3.2 Therapeutic vaccine 

For therapeutic vaccines (VE_S = 0, VE_P > 0, VE_I > 0), we evaluated vaccines
that prolonged life by 1, 2, 5, and 10 years, and that decreased infectivity by 
0%, 25%, 75% and 90% (Figure 15.4). The figure indicates the net increase 
in QALYs and expenditures (or savings) in millions of dollars ($M) for a 
therapeutic vaccine after 150 years, assuming no change in risk behaviors, in 
a late-stage epidemic, with 75% of the population vaccinated. Each point on 
the polygon represents a therapeutic vaccine with different efficacy for 
prolongation of life (1, 2, 5, and 10 years) and infectivity. The square 
labeled 100% infectivity represents a vaccine that does not reduce 
infectivity; the square labeled 10% infectivity represents a vaccine that 
reduces infectivity by 90%. As in Figure 15.2, the dotted lines indicate cost- 
effectiveness thresholds of $50,000 and $10,000 per QALY gained. A 
vaccine represented by a point on the polygon that falls between the two 
dotted lines has a cost-effectiveness ratio between $50,000 and $10,000 per 
QALY gained. A vaccine represented by a point to the right of the $10,000 
per QALY line costs less than $10,000 per QALY gained. Points on the
polygon below the horizontal axis represent vaccines that reduce net
expenditures, and therefore dominate the no-vaccination strategy.

A therapeutic vaccine that extended life by two years cost less than $10,000
per QALY gained, even if it provided no reduction in infectivity (Figure 
15.4). In contrast, a vaccine that reduced infectivity by 90% and increased 
length of life by 10 years (the rightmost point in Figure 15.4) resulted in 
large cost savings and a gain of about 28,000 QALYs in the vaccinated 
cohort. Thus, both the degree to which the vaccine prolonged life, VE_P, and
the degree to which it reduced infectivity, VE_I, had a large influence on
costs, benefits, and the cost effectiveness of the vaccine program. 

Figure 15.4 Long-term outcomes for a therapeutic vaccine 




The influence of VE_P on cost effectiveness is shown in Figure 15.5 for a
therapeutic vaccine that reduced infectivity (VE_I) by 5% or 10%. The figure
shows the incremental cost effectiveness of a vaccine program relative to the 
no-vaccination strategy in dollars per QALY gained. Vaccines that reduce 
infectivity by 5% and 10% are shown. The values represent long-term 
outcomes, with no behavior change, and 75% of the population vaccinated. 
A vaccine program resulted in expenditures of $8,000 per QALY gained if 
the vaccine only increased length of life by one year and reduced infectivity 
by 5% (Figure 15.5). Relatively small changes in infectivity had substantial
influence on the cost effectiveness of a vaccine program (Figure 15.5).
Vaccines that reduced infectivity by more than about 15% dominated the no- 
vaccination strategy, and are therefore not shown in Figure 15.5. 

Figure 15.5 Cost effectiveness of a therapeutic vaccine (vertical axis: incremental cost effectiveness of vaccination)




15.4 DISCUSSION 

An HIV vaccine may reduce susceptibility to disease, may prolong life in 
people who acquire HIV despite vaccination, and may reduce HIV 
transmission. A vaccine could have only one of these mechanisms of 
protection (for example, reducing susceptibility), or it could have all three. 

We evaluated preventive vaccines that provided protection from infection 
but did not prolong life or reduce transmission, and therapeutic vaccines that 
could both prolong life and reduce transmission, but provided no protection 
from infection. We evaluated these types of vaccines because of ongoing
research to develop such vaccines. Although a vaccine could have all three
mechanisms of protection, we can learn much about how the mechanism of 
protection influences costs and health benefits by assessing preventive and 
therapeutic vaccine programs independently. 

The first major finding of our analysis is that each of the types of efficacy, 
VE_S, VE_P, and VE_I, is extremely important for HIV vaccines. The health
benefits, and thus cost effectiveness, of HIV vaccines were strikingly 
dependent on each of the types of vaccine efficacy. Our analyses indicated 
that a preventive vaccine that provided only protection from infection was 
cost effective or cost saving under many plausible scenarios. Likewise, a 
therapeutic vaccine that provided no protection from infection was also cost 
effective or cost saving under most scenarios. 

An important implication of this finding is that an understanding of the 
effect of a vaccine in a population depends on assessing all three types of 
vaccine efficacy [3-5]. Longini and colleagues have discussed design of 
vaccine trials and how to augment trials so that all parameters of vaccine 
efficacy are assessed [3-5]. For a therapeutic vaccine, a trial that assessed 
only VE_P would leave substantial uncertainty about the health benefit of a
vaccine. As noted in Figure 15.4, for a given prolongation of life (VE_P), the
health benefit from the vaccine varies by a factor of three or more based on
variation in VE_I. These considerations have led to the addition of sexual
partner studies to vaccine trials to help assess changes in infectivity. 
Because infectivity correlates with the level of virus in blood, another 
strategy is to assess HIV viral load as a surrogate measure for infectivity. 
Direct assessment of transmission is preferable, however, if feasible. 

A second major finding of our analysis is that even vaccines with modest 
efficacy provided substantial health benefits and were cost effective or cost 
saving. By traditional standards, a vaccine that prevented infection in only 
25% to 50% of recipients would be considered a failure. In contrast, an HIV 
vaccine with these characteristics provided large health and economic
benefit. In part, HIV vaccines need not meet high standards of efficacy because
the mortality from HIV is so high. In addition, our analyses assumed that 
vaccine recipients did not increase their risky behavior; we previously have 
shown that increased risk behavior reduces the benefit of a vaccine program. 
As improvements in therapy reduce HIV mortality, or if studies show that 
vaccine recipients increase risky behavior, the efficacy of vaccines may need 
to increase to provide similar cost effectiveness. Nonetheless, our analysis 
indicates that vaccines of modest efficacy would provide great benefit. 

Our work has two important limitations. First, we developed our model 
prior to the development of highly active antiretroviral therapy. Highly
active antiretroviral therapy has sharply reduced the mortality of HIV
infection. The costs of HIV care are also now substantially different than 
when our model was developed. Highly active antiretroviral therapy has 
reduced hospitalizations and associated costs. However, the cost of highly 
active antiretroviral therapy may reach $15,000 per year, so drug costs have 
increased substantially. We are extending the analyses discussed here to 
account for both the reduced mortality and changing patterns of expenditures 
on HIV care. Although the quantitative results will certainly be somewhat 
different, we expect that the main qualitative findings of the current analyses 
will remain largely unchanged: all types of vaccine efficacy will be
important, and vaccines with modest protection will likely provide
substantial benefit. Second, as noted, recent developments in HIV vaccine 
research suggest that a vaccine may have all three mechanisms of protection. 
In future work, we will evaluate vaccines that provide all three components 
of vaccine efficacy. 

HIV now ranks as one of the most devastating pandemics in history. The 
development of a preventive or therapeutic vaccine, or a vaccine with hybrid 
characteristics, is an extraordinary public-health priority. Our evaluation 
indicates that a full understanding of the health benefit of HIV vaccines will 
require assessment of all three types of potential protection. Empiric 
assessment of the degree to which a vaccine protects from infection, and 
reduces progression of disease or transmission, will require long, complex, 
and expensive clinical trials. Fortunately, our analyses indicate that a 
vaccine need not be perfect or nearly so to have great health benefit. Even a 
partially effective HIV vaccine could dramatically change the course of the 
HIV epidemic.

SUMMARY

Several challenges have arisen in childhood immunization programs as 
vaccine manufacturers have become more successful in developing new 
vaccines for childhood diseases. Their success has created a combinatorial 
explosion of choices for health-care providers and other purchasers of 
pediatric vaccines, which in turn has created a new set of economic 
problems and issues related to determining which vaccines should be 
combined into single injections and how to design lowest overall cost 
formularies for pediatric immunization. This chapter provides a review of 
how operations research modeling and analysis tools can be used to address 
a variety of economic issues surrounding pediatric vaccine formulary design 
and pricing. A summary is presented of the pediatric immunization 
problems that have been studied using integer programming models, as well 
as the assumptions used to model such problems. A description of the 
methodologies used is provided. A summary of the results obtained with 
these models for a particular pentavalent combination vaccine that recently 
gained Food and Drug Administration (FDA) approval for pediatric 
immunization is presented. Concluding comments and directions for future 
research are also discussed. 

KEY WORDS 

Pediatric immunization, Combination vaccines, Economics, Integer 
programming models

16.1 INTRODUCTION

The United States recommended childhood immunization schedule has 
become increasingly complex. For example, the 2002 schedule required no
fewer than five clinic visits and 19 injections over the first 18 months of life
[1], with each clinic visit scheduled around the specifications and 
requirements associated with the vaccines. Other constraints that may lead 
to additional clinic visits include a child’s tolerance to multiple injections 
during a single clinic visit [2], and parents’ (or guardians’) ability to take the 
time from their jobs to make immunization visits [3]. These obstacles often 
lead to noncompliance with the recommended childhood immunization 
schedule, increasing the risk to children of contracting the diseases that the 
vaccines were designed to prevent, resulting in a tremendous cost burden on 
the nation’s already stressed health-care system. 

These problems are further exacerbated by biotechnological advances that 
have led to new pediatric vaccines that must be incorporated into the already 
overcrowded recommended childhood immunization schedule. For 
example, the four recommended doses of oral polio vaccine (OPV) in the 
1996 recommended childhood immunization schedule were replaced with 
four injections of inactivated polio vaccine (IPV) in the January 2002 
schedule [1]. In 2000, four doses of a new 7-valent conjugate vaccine for 
pneumococcal disease (PNUcn-7) were recommended to be included in the
recommended childhood immunization schedule [4]. Meeting the guidelines
set forth in the recommended childhood immunization schedule may now 
require up to five injections at each of three recommended immunization 
visits (2, 4, and 6 months) in the first year of life. 

New pediatric vaccines that gain Food and Drug Administration (FDA) 
approval and are added to the recommended childhood immunization 
schedule by the Advisory Committee on Immunization Practices (ACIP) will
exert pressure to increase both the volume and the frequency of 
immunization visits, and hence further escalate the costs associated with 
well-baby care (i.e., routine medical care check-ups during the first few 
years of life). One approach to circumvent this approaching problem is to 
create a single-dose oral vaccine that immunizes children at birth from all 
childhood diseases [5]. A more realistic solution is to combine two or more 
vaccines to reduce the required number of injections and clinic visits [6, 7]. 
Assuming equivalent efficacy of combination vaccines compared to their 
monovalent counterparts, significantly less time for clinic visits would be 
required of parents. This may in turn result in higher immunization 
compliance rates and an ensuing decrease in the number of children afflicted 
with childhood diseases.

Combination vaccines have their own unique set of problems [8]. From the
vaccine manufacturers’ point of view, determining which vaccines to 
combine (based on biological compatibility of the antigens) and how to 
administer the vaccines (both in sequence and in timing) to ensure that 
immunity is achieved without compromising safety are important questions 
that need to be resolved. Moreover, the issue of extra vaccination (i.e., 
administering vaccine components that are not required by the childhood 
immunization schedule during specific vaccination periods) and the degree 
to which it should be tolerated (so as to minimize any associated negative 
side effects and the cost of administering unneeded vaccine components) 
must be addressed. Lastly, the overall objective of designing an economical
package of vaccine types and brands to stock for a particular immunization 
environment must be addressed. Identifying solutions to these problems can 
be daunting for even the most experienced pediatric health-care researchers 
and professionals. Nonetheless, combining antigens that provide protection 
against multiple pediatric diseases into a single injection has been 
recognized by pediatric health-care providers to be an advance of significant 
benefit. In fact, licensed combination vaccines are officially preferred over 
their individual component vaccines because they reduce the pain and 
suffering associated with multiple injections [9]. 

This chapter reviews the application of integer programming models to 
address the design of economical pediatric vaccine formularies. These 
models were initially introduced to provide quantitative tools for use by the 
Centers for Disease Control and Prevention (CDC), health-care providers, 
insurance companies, and parents. Such tools can help them make well- 
informed vaccine formulary decisions [10, 11]. The models are designed 
using the principle that decisions based on purchase price alone, ignoring the 
economic value of distinguishing features among competing vaccine 
products, can be more costly in the long run. The resulting integer 
programming models assemble from among all available vaccine products at 
their market prices the vaccine formulary that provides the best value within 
the constraints of the immunization schedule, achieving the lowest overall 
cost to society or to any other desired perspective. The models select from 
among a set of monovalent (i.e., single antigen) and combination vaccines 
those products that should be used at which scheduled visits within the 
recommended childhood immunization schedule [1]. 

The chapter is organized as follows: Section 16.2 summarizes the pediatric 
immunization problems that have been studied using integer programming 
models. Section 16.3 provides a description of the methodologies used, as 
well as the assumptions used to model the problems. Section 16.4
summarizes the results obtained with these models for a particular
pentavalent combination vaccine that is well positioned to become available
for pediatric immunization in the near future. Section 16.5 provides
concluding comments and directions for future research. 

16.2 PROBLEM DESCRIPTION 

Weniger et al. [10] report the results of a pilot study that shows how integer
programming models can be used to design optimal pediatric vaccine 
formularies for a subsection of the recommended childhood immunization 
schedule. The concept “optimal” here refers to a vaccine formulary that 
provides the best economic value, considering more than just vaccine prices 
alone. The authors present an integer programming model to assist vaccine 
purchasers in making decisions about which vaccine products to include in 
their formulary (i.e., to stock in their inventory). The model takes into 
account not only the price of the vaccines, but also the cost of a clinic visit, 
the time to prepare a vaccine for injection, and the cost of administering 
injections. Jacobson et al. [11] report the technical details of these models. 
They also list several different vaccine formularies obtained by solving the 
model under a variety of economic criteria. The model does not make 
decisions about a specific vaccine product in isolation but, rather, assembles 
from among all competing monovalent and combination vaccines the 
formulary that satisfies the recommended childhood immunization schedule 
at the lowest overall cost to society (or, if desired, to any more limited 
perspective, such as the payer of direct health costs). The model’s design is 
based on the principle that purchase price alone is just one of many factors 
with economic consequences that should be taken into account. The key 
contribution of this study is the result that it may be myopic to use vaccine 
prices as the sole factor to determine which vaccines should be purchased 
and that other costs within the system can be captured and used to identify 
vaccine formularies that provide a good overall value. 

To encourage and evaluate new investment and research by the 
pharmaceutical industry for innovative and new vaccine products, Sewell et 
al. [12] use the integer programming models to reverse engineer the price of 
various combination vaccines using an iterative bisection search algorithm 
[13]. This algorithm is detailed in Figure 16.1. The algorithm determines 
the maximum inclusion price of each combination vaccine, with and without 
the perinatal dose of hepatitis B (i.e., a dose administered at birth), across 
five injection cost variations. This involves dividing arbitrary, extreme 
upper and lower values for the maximal price into equal-sized upper and 
lower ranges, and then solving the integer programming model to determine 
which of these ranges contains the maximal price. The identified range is 
then divided in half, and the process is iteratively repeated until the 
algorithm converges when the final upper and lower range is less than $0.01, 
revealing the maximal price to the nearest one cent. Note that for a bisection 
search to be effective, a well-behaved cost function is required, such as one 
that is convex over the feasible region of possible formularies, which was 
the case for this study. 

Figure 16.1 Reverse engineering algorithm 

Inputs: c_I = cost of administering an injection. 
        d = desired number of doses of the combination vaccine in the 
            lowest overall cost formulary. 

Goal:   Compute the maximal inclusion price (i.e., the price at which d 
        doses of the combination vaccine will be used in the lowest cost 
        formulary). 

High = $500 
    (Zero doses of the combination vaccine will be used at this price.) 
Low = $0 
    (As many doses of the combination vaccine as possible will be 
    used at this price.) 

Repeat 
    Mid = (High + Low) / 2. 
    Set the price of the combination vaccine equal to Mid. 
    Solve the integer program for the lowest overall cost formulary. 
    If fewer than d doses of the combination vaccine are used in the 
    lowest cost formulary, 
        Set High = Mid. 
    Else 
        Set Low = Mid. 
Until (High - Low) < 0.01 

Output: Low is the desired maximal inclusion price of the combination 
        vaccine. 
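
The bisection search of Figure 16.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the callable doses_used is a hypothetical stand-in for solving the integer program at a given price and counting the doses of the combination vaccine in the lowest overall cost formulary, and toy_doses_used is a made-up step function rather than data from the study. The sketch also assumes that the number of doses used never increases as the price rises.

```python
from typing import Callable


def maximal_inclusion_price(
    doses_used: Callable[[float], int],
    d: int,
    low: float = 0.0,
    high: float = 500.0,
    tol: float = 0.01,
) -> float:
    """Bisection search for the highest price at which at least d doses are used.

    doses_used(price) is a hypothetical stand-in for solving the integer
    program with the combination vaccine priced at `price`. It is assumed
    to be non-increasing in price.
    """
    while high - low >= tol:
        mid = (high + low) / 2
        if doses_used(mid) < d:
            high = mid  # too expensive: fewer than d doses are used
        else:
            low = mid   # cheap enough: at least d doses are used
    return low


def toy_doses_used(price: float) -> int:
    """Made-up step function for illustration only (not study data)."""
    if price <= 20.0:
        return 3
    if price <= 35.5:
        return 2
    if price <= 48.25:
        return 1
    return 0


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, round(maximal_inclusion_price(toy_doses_used, d), 2))
```

With the starting range of $0 to $500 and a tolerance of $0.01, the loop halves the range 16 times, so each maximal price costs 16 solves of the integer program.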

Reverse engineered prices for combination vaccines provide vaccine 
manufacturers with guidelines on how well future vaccine combinations may 
compete in the market, and hence provide information that can be used to 
determine how long it may take to recoup research and development 
investments in such products. In recent years, several pentavalent and 
hexavalent vaccines built around diphtheria, tetanus, and acellular pertussis 
(DTPa) backbones have been under development, and their developers are 
positioning them to gain FDA approval [14]. Sewell and Jacobson [15] 
provide technical details of the reverse engineering algorithm that 
determines the maximum price at which different combination vaccines 
provide an overall economic advantage, and hence belong in a lowest overall 
cost formulary. Jacobson and Sewell [16] incorporate the reverse 
engineering algorithm into a Monte Carlo simulation model to determine 
probability distributions for the price of four combination vaccines. 
Health-care providers and parents each place a different value (hence cost) on each 
injection administered (or avoided). Therefore, for a given set of injection 
costs there exists a maximal price at which a combination vaccine joins the 
lowest overall cost formulary (i.e., provides a good economic value). This 
maximal price can be determined by iteratively solving the model in 
Jacobson and Sewell [16]. Monte Carlo simulation is used to sample the 
injection costs from a set of probability distributions, where each probability 
distribution corresponds to the values that a population of parents may place 
on administering or avoiding an injection. The resulting set of maximal 
prices for each combination vaccine is used to create an empirical 
distribution that estimates the probability distribution of maximal prices for 
that combination vaccine. This probability distribution can be used, for 
example, to estimate the proportion of a population of parents who are 
willing to purchase the combination vaccine at a given price. Therefore, the 
maximal price probability distribution provides marketing information for 
vaccine manufacturers. Jacobson and Sewell [16] also use different 
injection cost probability distributions to assess the sensitivity of the 
maximal price distribution to the form of this probability distribution. 
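
The sampling procedure described above can be sketched as follows. This is a minimal illustration, not the model of [16]: the callable maximal_price is a hypothetical stand-in for running the reverse engineering algorithm at a sampled injection cost, and the linear relationship and uniform injection cost distribution in the usage lines are invented purely so the example runs.

```python
import random
from typing import Callable, List


def sample_maximal_prices(
    maximal_price: Callable[[float], float],
    draw_injection_cost: Callable[[random.Random], float],
    n_samples: int,
    seed: int = 0,
) -> List[float]:
    """Draw injection costs and record the maximal price found for each draw."""
    rng = random.Random(seed)
    return sorted(
        maximal_price(draw_injection_cost(rng)) for _ in range(n_samples)
    )


def proportion_willing_to_pay(prices: List[float], price: float) -> float:
    """Empirical fraction of sampled maximal prices that are at least `price`."""
    return sum(p >= price for p in prices) / len(prices)


def toy_maximal_price(injection_cost: float) -> float:
    """Made-up relationship (assumption): a higher injection cost makes a
    combination vaccine worth more, because it avoids injections."""
    return 30.0 + 2.0 * injection_cost


def toy_injection_cost(rng: random.Random) -> float:
    """Made-up distribution (assumption): uniform between $0 and $40."""
    return rng.uniform(0.0, 40.0)


if __name__ == "__main__":
    prices = sample_maximal_prices(toy_maximal_price, toy_injection_cost, 1000)
    for price in (30.0, 60.0, 90.0):
        print(price, proportion_willing_to_pay(prices, price))
```

Swapping toy_injection_cost for another distribution is the kind of sensitivity check the text describes, since it shows how the empirical maximal price distribution shifts with the assumed form of the injection cost distribution.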

16.3 METHODOLOGY AND ASSUMPTIONS 

A generic integer programming model is presented that captures the first 12 
years of the 2002 recommended childhood immunization schedule for 
immunization against any subset of childhood diseases covered by the 
recommended childhood immunization schedule (which currently includes 
hepatitis B, diphtheria, tetanus, pertussis, Haemophilus influenzae type B, 
polio, measles, mumps, rubella, varicella, and pneumococcus). This model 
is an extension of the integer programming model introduced in the pilot 
study reported in [11]. The generic integer programming model is as 
follows: 

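Parameters 

$V$ = set of vaccines that may be administered 

$B = \{\text{AVP}, \text{GSK}, \text{MRK}, \text{WYE}\}$ = set of manufacturers (brands) 

$b_v$ = brand of vaccine $v \in V$ 

$T^* = \{\text{HBV}, \text{DTP}_a, \text{Td}, \text{HIB}, \text{IPV}, \text{MMR}, \text{VAR}, \text{PCV}\}$ = set of standard sets of antigens 

$T_v$ = set of standard sets of antigens contained in vaccine $v \in V$ 

$M^* = \{0\text{-}1, 2, 4, 6, 12\text{-}18, 60, 144\}$ = set of months in which vaccines may be administered 

$M_v$ = set of months in which vaccine $v \in V$ may be administered 

$c_I$ = cost of administering an injection 

$c_V$ = cost of visiting a clinic 

$c_v$ = cost of vaccine $v \in V$ including the preparation cost = price of vaccine plus preparation cost 

$X = \{(v,m) : v \in V, m \in M_v\}$ = set of pairs $(v,m)$ such that it is permissible to administer vaccine $v \in V$ in month $m \in M_v$ 

$X_t = \{(v,m) \in X : t \in T_v\}$ for all $t \in T^*$ 

Variables 

$x_{v,m}$ = 1 if vaccine $v$ is given in month $m$, 0 else, for all $(v,m) \in X$ 

HibSkip6 = 1 if HIB can be skipped in month 6 due to using MRK Hib in months 2 and 4, 0 otherwise 

$s_m$ = number of shots (injections) given in month $m$, for all $m \in M^*$ 

$v_m$ = 1 if the clinic is visited in month $m$, 0 else, for all $m \in M^*$ 

Objective Function 

$$\min \sum_{(v,m) \in X} c_v x_{v,m} + \sum_{m \in M^*} c_V v_m + \sum_{m \in M^*} c_I s_m \tag{1}$$

Constraints 

$$\sum_{(v,0\text{-}1) \in X_{\text{HBV}}} x_{v,0\text{-}1} = \begin{cases} 1 & \text{if perinatal dose of HBV is to be given} \\ 0 & \text{else} \end{cases} \tag{2}$$

$$\sum_{(v,2) \in X_{\text{HBV}}} x_{v,2} + \sum_{(v,4) \in X_{\text{HBV}}} x_{v,4} \ge 1 \tag{3}$$

$$\sum_{(v,6) \in X_{\text{HBV}}} x_{v,6} + \sum_{(v,12\text{-}18) \in X_{\text{HBV}}} x_{v,12\text{-}18} \ge 1 \tag{4}$$

$$\sum_{(v,m) \in X_{\text{HBV}} :\, m = 0\text{-}1,\, 2, \text{ or } 4} x_{v,m} \ge 2 \tag{5}$$

$$\sum_{(v,m) \in X_{\text{DTP}_a}} x_{v,m} \ge 1, \quad m = 2, 4, 6, 12\text{-}18, 60 \tag{6}$$

$$\sum_{(v,144) \in X_{\text{Td}}} x_{v,144} \ge 1 \tag{7}$$

$$\sum_{(v,m) \in X_{\text{HIB}}} x_{v,m} \ge 1, \quad m = 2, 4, 12\text{-}18 \tag{8}$$

$$\text{HibSkip6} + \sum_{(v,6) \in X_{\text{HIB}}} x_{v,6} \ge 1 \tag{9}$$

$$\sum_{(v,m) \in X_{\text{IPV}}} x_{v,m} \ge 1, \quad m = 2, 4, 60 \tag{10}$$

$$\sum_{(v,m) \in X_{\text{IPV}} :\, m = 6 \text{ or } 12\text{-}18} x_{v,m} \ge 1 \tag{11}$$

$$\sum_{(v,m) \in X_{\text{PCV}}} x_{v,m} \ge 1, \quad m = 2, 4, 6, 12\text{-}18 \tag{12}$$

$$\sum_{(v,m) \in X_{\text{MMR}}} x_{v,m} \ge 1, \quad m = 12\text{-}18, 60 \tag{13}$$

$$\sum_{(v,15) \in X_{\text{VAR}}} x_{v,15} \ge 1 \tag{14}$$

[DTP$_a$ brand-matching constraint; expression illegible in source], $m = 2, 4, 6, 12\text{-}18, 60$, for all $b \in B$ (15) 

$$s_m = \sum_{(v,m) \in X} x_{v,m} \quad \text{for all } m \in M^* \tag{16}$$

$$10\, v_m \ge s_m \quad \text{for all } m \in M^* \tag{17}$$

$$\sum_{(v,m) \in X_{\text{HIB}} :\, b_v = \text{MRK}} x_{v,m} \ge \text{HibSkip6}, \quad m = 2, 4 \tag{18}$$

$$\sum_{(v,m) \in X_{\text{HIB}} :\, b_v = \text{MRK},\, m = 2 \text{ or } 4} x_{v,m} - \text{HibSkip6} \le 1 \tag{19}$$

$$\sum_{(v,m) \in X_{\text{DTP}_a}} x_{v,m} = 5 \tag{20}$$

$$\sum_{(v,m) \in X_t} x_{v,m} \le 1 \quad \text{for all } t \in T^*, \text{ for all } m \in M^* \tag{21}$$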

The objective function of the integer programming model is given by (1). 
The constraints of the model are as follows. Constraint (2) fixes whether 
the perinatal dose of HBV is given. Constraint (3) requires that 
HBV must be given in month 2 or 4, and constraint (4) requires that HBV 
must be given in month 6 or month 12-18. Constraint (5) says that the first 
two doses of HBV must be given in months 0-1, 2, or 4. Constraint (6) says 
that DTPa is required in months 2, 4, 6, 12-18, and 60. Constraint (7) says 
that Td is required in month 144. Constraint (8) says that HIB is required in 
months 2, 4, and 12-18, while constraint (9) says that HIB is required in 
month 6 unless MRK HIB is used in months 2 and 4. Constraint (10) says 
that IPV is required in months 2, 4, and 60, while constraint (11) says that 
IPV is required in months 6 or 12-18. Constraint (12) says that PCV is 
required in months 2, 4, 6, and 12-18. Constraint (13) says that MMR is 
required in months 12-18 and 60. Constraint (14) says that VAR is required 
in month 15. Constraint (15) enforces DTPa brand matching. Constraint 
(16) calculates the number of shots given in each month. Constraint (17) 
ensures that the clinic is visited in each month in which a vaccine is 
administered. Constraint (18) says that HIB can be skipped in month 6 only 
if MRK HIB is used in months 2 and 4; if this is so, then constraint (19) sets 
the variable HibSkip6 to 1. Constraint (20) says that DTPa extravaccination 
is not permitted. Finally, constraint (21) says that two doses of the same 
standard set of antigens are not allowed in the same month. 
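
To make the objective function (1) concrete, the sketch below evaluates the total cost of a given schedule: vaccine costs, plus a clinic visit cost for every month with at least one shot, plus an injection cost per shot. It is an illustration only and does not solve the integer program. The vaccine names, vaccine costs, and the $10 injection cost are hypothetical; the $40 clinic visit cost is the value used in this chapter.

```python
from typing import Dict, List, Tuple

# A schedule lists the (vaccine, month) pairs for which x[v, m] = 1.
Schedule = List[Tuple[str, str]]


def formulary_cost(
    schedule: Schedule,
    vaccine_cost: Dict[str, float],
    clinic_visit_cost: float,
    injection_cost: float,
) -> float:
    """Evaluate objective (1) for a fixed schedule."""
    shots_per_month: Dict[str, int] = {}
    for _vaccine, month in schedule:
        # s_m: number of shots given in month m, as in constraint (16)
        shots_per_month[month] = shots_per_month.get(month, 0) + 1

    vaccine_total = sum(vaccine_cost[v] for v, _month in schedule)
    # v_m = 1 exactly for the months in which at least one shot is given
    visit_total = clinic_visit_cost * len(shots_per_month)
    injection_total = injection_cost * sum(shots_per_month.values())
    return vaccine_total + visit_total + injection_total


# Hypothetical costs (price plus preparation), for illustration only.
HYPOTHETICAL_COSTS = {"DTPa": 12.0, "HIB": 8.0, "COMBO": 25.0}

if __name__ == "__main__":
    separate = [("DTPa", "2"), ("HIB", "2")]
    combined = [("COMBO", "2")]
    print(formulary_cost(separate, HYPOTHETICAL_COSTS, 40.0, 10.0))  # 80.0
    print(formulary_cost(combined, HYPOTHETICAL_COSTS, 40.0, 10.0))  # 75.0
```

In this made-up case the combination product costs more than the two separate vaccines together ($25 against $20), yet it yields the lower overall cost because it avoids one injection.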

To illustrate the use of this model, the pentavalent combination vaccine - 
comprising vaccines for diphtheria, tetanus, acellular pertussis, hepatitis B, 
and inactivated polio (labeled DTPa-HBV-IPV) - is analyzed to determine 
the number of doses of the vaccine that earn a place in the lowest overall 
cost formulary at varying price levels. This particular combination vaccine 
was chosen since it is well-positioned to become available for pediatric 
immunization in the near future. 

Several assumptions are made that provide boundaries for the scope of the 
results presented. Unless otherwise noted, the assumptions in Sewell et al. 
[12] are used. The federally negotiated vaccine price list (effective August 
9, 2002) is used for the four pharmaceutical companies (labeled AVP = Aventis 
Pasteur, MRK = Merck, GSK = GlaxoSmithKline, WYE = Wyeth-Lederle) 
that manufacture all the vaccines that are currently licensed and under 
federal contract with the CDC for childhood immunization. These four 
vaccine manufacturers produce 14 vaccine products that protect against the 
11 diseases (labeled HBV for the hepatitis B vaccine, DTPa for the 
diphtheria, tetanus, acellular pertussis vaccine, HIB for the Haemophilus 
influenzae type B vaccine, IPV for the inactivated polio vaccine, MMR for 
the measles, mumps and rubella vaccine, PNUcn-7 for the pneumococcal 
vaccine, and VAR for the varicella vaccine). 

The cost function for the integer programming model includes 

• the purchase price of all licensed vaccines under federal contract, 

• the cost of each clinic visit, 

• the cost of vaccine preparation by medical staff, 

• the cost of administering an injection. 

Values used for these costs are shown in Table 16.1. The vaccine purchase 
prices are the federally negotiated prices as of August 9, 2002 (see Table 
16.1). The cost of a clinic visit is set at $40, the same value used in the CDC 
pilot study and the more recent studies [10-12, 15, 16]. This cost includes 
the direct and indirect costs associated with a clinic being able to administer 
vaccines [10]. 

Vaccine preparation is assumed to require 3.0 minutes per dose for 
powdered vaccines [p]. This preparation requires vaccine reconstitution: 
diluent is drawn into a syringe, transferred into the vaccine vial, which is 
shaken, and then the liquefied vaccine is withdrawn. Liquid vaccines in 
vials [v] requiring entry with a needle to draw up into a syringe were 
assigned 1.5 minutes, and ready-to-administer prefilled syringes [s] were 
assigned 0.5 minutes. These assumptions were distributed around the 
average times observed in previous studies [17, 18]. Note that these times 
are also consistent with unpublished developing-country estimates of around 
1.0 minute for filling and administering injections from disposable syringes 
and 80 seconds for resterilizable syringes [19]. Labor costs are set at $0.50 
per minute, as in previous studies [12, 13]; this is equivalent to an annual 
total compensation of $60,000 for a 2,000 hour work year. Therefore, the 
resulting preparation costs for powders [p], vials [v], and syringes [s] are 
$1.50, $0.75, and $0.25 per dose, respectively (see Table 16.1). Three of the 
vaccine products, one brand of DTPa, one brand of HBV, and one brand of 
IPV, are available in both pre-filled syringes and liquid vial formulations. 
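
The preparation cost arithmetic above can be checked with a short calculation. The preparation times and the $0.50 per minute labor rate come from the text; the function and constant names are only illustrative.

```python
# Labor rate stated in the text, in dollars per minute.
LABOR_COST_PER_MINUTE = 0.50

# Assumed preparation times from the text, in minutes per dose:
# powdered vaccines [p], liquid vials [v], prefilled syringes [s].
PREP_MINUTES = {"p": 3.0, "v": 1.5, "s": 0.5}


def preparation_cost(form: str) -> float:
    """Preparation cost per dose, in dollars, for packaging form p, v, or s."""
    return PREP_MINUTES[form] * LABOR_COST_PER_MINUTE


def annual_compensation(hours_per_year: float = 2000.0) -> float:
    """Annual total compensation implied by the per-minute labor rate."""
    return LABOR_COST_PER_MINUTE * 60.0 * hours_per_year


if __name__ == "__main__":
    for form in ("p", "v", "s"):
        print(form, preparation_cost(form))  # 1.5, 0.75, 0.25
    print(annual_compensation())  # 60000.0
```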




